ramsi-k committed
Commit bce4c09 · 1 Parent(s): 52d993c

initial move

.gitignore ADDED
@@ -0,0 +1,18 @@
+ # Environment and local config
+ .env
+ config.py
+
+ # Compiled Python artifacts
+ *.pyc
+ __pycache__/
+ *__pycache__*
+
+ # Asset folders
+ assets/
+
+ # Storyboard sessions and local DBs
+ storyboard/
+ memory.db
+ *.sqlite3
+ *.png
+ services/__pycache__/message_factory.cpython-311.pyc
LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2025 Ramsi Kalia
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
README.md CHANGED
@@ -1,14 +1,171 @@
  ---
- title: Agentic Comic Generator
- emoji: 🦀
- colorFrom: indigo
- colorTo: yellow
- sdk: gradio
- sdk_version: 5.33.1
- app_file: app.py
- pinned: false
- license: mit
- short_description: A multi-agent system comic generator
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Agentic Comic Generator
+
+ ![Python](https://img.shields.io/badge/language-python-blue)
+ ![Gradio](https://img.shields.io/badge/frontend-Gradio-orange)
+ ![Modal](https://img.shields.io/badge/backend-Modal-lightgrey)
+
+ > 🎨 Multi-agent AI system for generating comic panels from story prompts
+
+ A multi-agent AI system that transforms user prompts into illustrated comic panels. Agent Brown handles narrative logic and dialogue. Agent Bayko renders the visuals. Designed as an experiment in agent collaboration, creative storytelling, and generative visuals.
+
+ ## 🎗️ Key Features
+
+ - Modular agents for dialogue and image generation
+ - Prompt-to-panel storytelling pipeline
+ - Gradio-powered web interface
+ - Easily extendable for TTS, styles, or emotion tagging
+
+ ## ✍️ Status
+
+ Currently under active development as an experimentation and portfolio project.
+
+ ## 📁 Directory Structure
+
+ ```text
+ project-root/
+ ├── app.py              # Entrypoint for Gradio
+ ├── api/                # FastAPI routes and logic
+ ├── agents/
+ │   ├── brown.py
+ │   └── bayko.py
+ ├── plugins/
+ │   ├── base.py
+ │   └── tts_plugin.py
+ ├── services/
+ │   └── ai_service.py
+ ├── config.py
+ ├── modal_app.py
+ ├── storyboard/         # Where all output sessions go
+ │   └── session_xxx/
+ ├── requirements.txt
+ ├── README.md
+ └── tech_specs.md
+ ```
+
+ ## 💡 Use Case
+
+ A user enters a storytelling prompt via a secure WebUI.
+ The system responds with:
+
+ - Stylized dialogue
+ - Rendered comic panels
+ - Optional voiceover narration
+
+ Behind the scenes, two agents — Bayko and Brown — process and generate the comic collaboratively while remaining isolated via network boundaries.
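+
+ Each run writes its results into a per-session folder under `storyboard/`. The layout below is inferred from `_create_session_directories` in `agents/bayko.py` (illustrative, not exhaustive):
+
+ ```text
+ storyboard/session_xxx/
+ ├── content/      # panel_1.png … panel_4.png, metadata.json
+ ├── agents/       # bayko_state.json
+ ├── iterations/   # v1_generation.json, v2_generation.json, …
+ └── output/       # final assembled artifacts
+ ```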
+
  ---
+
+ ## 📞 Agent Communication & Storage
+
+ ## 👥 Agent Roles
+
+ Two core agents form the backbone of this system:
+
+ - 🤖 **Agent Brown** – The front-facing orchestrator. It receives the user’s prompt, tags the style, validates inputs, and packages the story plan for execution.
+ - 🧠 **Agent Bayko** – The creative engine. It handles image, audio, and subtitle generation based on the structured story plan from Brown.
+
+ Each agent operates in isolation but contributes to the shared goal of generating cohesive, stylized comic outputs.
+
+ ### Agent Brown
+
+ - 🔹 Input validator, formatter, and storyboard author
+ - ✨ Adds style tags ("Ghibli", "tragedy", etc.)
+ - 📦 Writes JSON packages for Bayko
+ - 🛡️ Includes moderation tools and a profanity filter
+
+ ### Agent Bayko
+
+ - 🧠 Reads storyboard.json and routes via MCP
+ - 🛠️ Toolchain orchestration (SDXL, TTS, Subtitler)
+ - 🎞️ Output assembly logic
+ - 🔄 Writes final output + metadata
+
+ Brown and Bayko operate in a feedback loop, refining outputs collaboratively across multiple turns, simulating human editorial workflows.
+
+ ## 🔁 Agent Feedback Loop
+
+ This system features a multi-turn agent interaction flow, where Brown and Bayko collaborate via structured JSON messaging.
+
+ ### Step-by-Step Collaboration
+
+ 1. **User submits a prompt via the WebUI**
+    → Brown tags the style, checks for profanity, and prepares a `storyboard.json`.
+
+ 2. **Brown sends JSON to Bayko via shared storage**
+    → Includes panel count, style tags, narration request, and subtitles config (see the example message after this list).
+
+ 3. **Bayko processes each panel sequentially**
+    → For each panel, it generates:
+
+    - `panel_X.png` (image)
+    - `panel_X.mp3` (narration)
+    - `panel_X.vtt` (subtitles)
+
+ 4. **Brown reviews Bayko’s output against the prompt**
+
+    - If all panels match: compile the final comic.
+    - If anything mismatches: Brown returns annotated JSON with a `refinement_request`.
+
+ 5. **UI reflects agent decisions**
+    → Shows messages like “Waiting on Bayko…” or “Refining… hang tight!”
+
+ This feedback loop allows for **multi-turn refinement**, **moderation hooks**, and extensibility (like emotion tagging or memory-based rejections).
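+
+ For concreteness, a Brown → Bayko generation request looks roughly like this (trimmed from the test fixture in `agents/bayko.py`; a sketch of the message shape, not a frozen schema):
+
+ ```python
+ generation_request = {
+     "message_type": "generation_request",
+     "payload": {
+         "prompt": "A moody K-pop idol finds a puppy on the street. ...",
+         "style_tags": ["whimsical", "nature", "soft_lighting", "watercolor"],
+         "panels": 4,
+         "language": "korean",
+         "extras": ["narration", "subtitles"],
+     },
+     "context": {"session_id": "session_xxx", "iteration": 1},
+ }
+ ```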
+
+ ### User Interaction
+
+ - When the user submits a prompt, the system enters a "processing" state.
+ - If Brown flags an issue, the UI displays a message such as “Refining content… please wait.”
+ - This feedback loop can be extended for multi-turn interactions, allowing further refinement for higher-quality outputs.
+
+ This modular design not only demonstrates the agentic behavior of the system but also allows for future expansions such as incorporating memory and adaptive feedback over multiple turns.
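+
+ When Brown rejects a draft, the message it sends back carries the feedback and the areas to redo. A minimal sketch, using the field names consumed by `process_refinement_request` in `agents/bayko.py` (the values shown are illustrative):
+
+ ```python
+ refinement_request = {
+     "message_type": "refinement_request",
+     "payload": {
+         "original_content": {"panels": ["..."]},  # Bayko's previous output
+         "feedback": {"visual_quality": "faces drift between panels"},  # illustrative
+         "focus_areas": ["visual_quality", "style_consistency"],
+         "specific_improvements": ["keep the idol's outfit consistent"],  # illustrative
+     },
+     "context": {"session_id": "session_xxx", "iteration": 2},
+ }
+ ```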
+
+ ## ⚙️ Example Prompt
+
+ ```text
+ Prompt: “A moody K-pop idol finds a puppy on the street. It changes everything.”
+ Style: 4-panel, Studio Ghibli, whisper-soft lighting
+ Language: Korean with English subtitles
+ Extras: Narration + backing music
+ ```
+
+ For detailed multi-turn logic and JSON schemas, see [Feedback Loop Implementation](./tech_specs.md#-multi-turn-agent-communication).
+
  ---

+ ## 🧠 System Architecture
+
+ ### 🏗️ Technical Overview
+
+ The system combines **FastAPI** backend services, **Gradio** frontend, **Modal** compute scaling, and **LlamaIndex** agent orchestration to create a sophisticated multi-agent workflow.
+
+ ```mermaid
+ graph TD
+     A[👤 User Input<br/>Gradio Interface] --> B[🤖 Agent Brown<br/>Orchestrator]
+     B --> C[🧠 LlamaIndex<br/>Memory & State]
+     B --> D[📨 JSON Message Queue<br/>Agent Communication]
+     D --> E[🎨 Agent Bayko<br/>Content Generator]
+     E --> F[☁️ Modal Inference<br/>Compute Layer]
+
+     subgraph "🎯 Sponsor Tool Integration"
+         G[🤖 OpenAI API<br/>Dialogue Generation]
+         H[🦙 Mistral API<br/>Style & Tone]
+         I[🤗 HuggingFace<br/>SDXL Models]
+         J[⚡ Modal Labs<br/>Serverless Compute]
+     end
+
+     F --> G
+     F --> H
+     F --> I
+     E --> J
+
+     E --> K[✅ Content Validation]
+     K --> L{Quality Check}
+     L -->|❌ Needs Refinement| D
+     L -->|✅ Approved| M[📦 Final Assembly]
+     M --> N[🎨 Comic Output<br/>Gradio Display]
+
+     style A fill:#e1f5fe
+     style B fill:#f3e5f5
+     style E fill:#e8f5e8
+     style F fill:#fff3e0
+ ```
agents/__init__.py ADDED
@@ -0,0 +1 @@
+ """Bayko agents package"""
agents/bayko.py ADDED
@@ -0,0 +1,1017 @@
+ """
+ Agent Bayko - The Creative Engine
+
+ Agent Bayko is the content generation specialist that handles:
+ - Reading structured JSON requests from Agent Brown
+ - Generating comic panel images using SDXL via Modal compute
+ - Creating audio narration using TTS services
+ - Generating subtitle files in VTT format
+ - Managing output files in the session directory structure
+ - Updating metadata as content is generated
+ - Supporting refinement requests from Brown's feedback loop
+ """
+
+ import uuid
+ import json
+ import time
+ import os
+ import logging
+ from datetime import datetime
+ from typing import Dict, List, Optional, Any, Tuple, Union, cast
+ from pathlib import Path
+ from dataclasses import dataclass, asdict, field
+ from enum import Enum
+
+ from openai import OpenAI
+ from openai.types.chat import ChatCompletion
+ from llama_index.llms.openai import OpenAI as LlamaOpenAI
+
+ # Core services
+ from services.unified_memory import AgentMemory
+ from services.session_manager import SessionManager as ServiceSessionManager
+ from services.message_factory import MessageFactory, AgentMessage, MessageType
+
+ # Tools
+ from agents.bayko_tools import ModalImageGenerator, ModalCodeExecutor
+
+ # Configure logging
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+
+
+ class GenerationStatus(Enum):
+     """Status of content generation"""
+
+     PENDING = "pending"
+     IN_PROGRESS = "in_progress"
+     COMPLETED = "completed"
+     FAILED = "failed"
+     REFINEMENT_NEEDED = "refinement_needed"
+
+
+ class ContentType(Enum):
+     """Types of content Bayko can generate"""
+
+     IMAGE = "image"
+     AUDIO = "audio"
+     SUBTITLES = "subtitles"
+     METADATA = "metadata"
+
+
+ @dataclass
+ class PanelContent:
+     """Content and metadata for a single comic panel"""
+
+     panel_id: int
+     description: str
+     enhanced_prompt: str = ""  # LLM-enhanced prompt
+     image_path: Optional[str] = None
+     image_url: Optional[str] = None
+     audio_path: Optional[str] = None
+     subtitles_path: Optional[str] = None
+     status: str = "pending"
+     generation_time: float = 0.0
+     style_tags: List[str] = field(default_factory=list)
+     errors: List[str] = field(default_factory=list)
+     refinement_history: List[Dict[str, Any]] = field(default_factory=list)
+
+     def to_dict(self) -> Dict[str, Any]:
+         return asdict(self)
+
+
+ @dataclass
+ class GenerationResult:
+     """Result of a generation or refinement request"""
+
+     session_id: str
+     panels: List[PanelContent]
+     metadata: Dict[str, Any]
+     status: GenerationStatus
+     total_time: float
+     errors: List[str] = field(default_factory=list)
+     refinement_applied: bool = False
+
+     def to_dict(self) -> Dict[str, Any]:
+         return {
+             "session_id": self.session_id,
+             "panels": [panel.to_dict() for panel in self.panels],
+             "metadata": self.metadata,
+             "status": self.status.value,
+             "total_time": self.total_time,
+             "errors": self.errors,
+             "refinement_applied": self.refinement_applied,
+         }
+
+
+ class AgentBayko:
+     """
+     Agent Bayko - The Creative Engine
+
+     Main responsibilities:
+     - Process structured requests from Agent Brown
+     - Generate comic panel images via SDXL/Modal
+     - Create audio narration using TTS
+     - Generate subtitle files in VTT format
+     - Manage session file organization
+     - Support refinement requests and feedback loops
+     - Update metadata and performance metrics
+     """
+
+     def __init__(self, llm: Optional[LlamaOpenAI] = None):
+         # Core tools
+         self.image_generator = ModalImageGenerator()
+         self.code_executor = ModalCodeExecutor()
+         self.llm = llm
+
+         # Session state
+         self.current_session: Optional[str] = None
+         self.session_manager: Optional[ServiceSessionManager] = None
+         self.memory: Optional[AgentMemory] = None
+         self.message_factory: Optional[MessageFactory] = None
+
+         # Stats tracking
+         self.generation_stats = {
+             "panels_generated": 0,
+             "refinements_applied": 0,
+             "total_time": 0.0,
+             "errors": [],
+         }
+
+     async def process_generation_request(self, message: Any) -> AgentMessage:
+         """
+         Process generation request from Agent Brown
+
+         Args:
+             message: Request from Brown (can be direct dict or wrapped message)
+
+         Returns:
+             AgentMessage with generated content and metadata
+         """
+         start_time = time.time()
+         context: Dict[str, Any] = {}  # Populated when Brown sends a wrapped message
+
+         # Handle different input types from Brown
+         if isinstance(message, list):
+             # If it's a list, it's probably a chat history - extract the last user message
+             if message and hasattr(message[-1], "content"):
+                 # Try to parse JSON from the content
+                 try:
+                     payload = json.loads(message[-1].content)
+                     session_id = payload.get("session_id", "hackathon_session")
+                 except Exception:
+                     # Fallback - create basic payload
+                     payload = {
+                         "prompt": (
+                             str(message[-1].content)
+                             if message
+                             else "Generate a comic"
+                         ),
+                         "session_id": "hackathon_session",
+                     }
+                     session_id = "hackathon_session"
+             else:
+                 # Empty list fallback
+                 payload = {
+                     "prompt": "Generate a comic",
+                     "session_id": "hackathon_session",
+                 }
+                 session_id = "hackathon_session"
+         elif isinstance(message, dict):
+             # Handle both direct dict from Brown and wrapped message formats
+             if "payload" in message:
+                 # Wrapped message format
+                 payload = message.get("payload", {})
+                 context = message.get("context", {})
+                 session_id = context.get("session_id", "hackathon_session")
+             else:
+                 # Direct dict from Brown workflow
+                 payload = message
+                 session_id = message.get("session_id", "hackathon_session")
+         else:
+             # Fallback for any other type
+             payload = {
+                 "prompt": str(message),
+                 "session_id": "hackathon_session",
+             }
+             session_id = "hackathon_session"
+
+         if not session_id:
+             session_id = "hackathon_session"
+
+         # Initialize session
+         self._initialize_session(session_id, None)
+         logger.info(f"Processing generation request for session {session_id}")
+
+         # Extract generation parameters - handle both Brown formats
+         prompt = payload.get("enhanced_prompt") or payload.get("prompt", "")
+         original_prompt = payload.get("original_prompt", prompt)
+         style_tags = payload.get("style_tags", [])
+         panel_count = payload.get("panels", 4)
+
+         # Create panel descriptions (using Brown's enhanced prompt for now)
+         panel_prompts = self._create_panel_descriptions(
+             prompt, panel_count, payload.get("style_config", {})
+         )
+
+         # Generate content for each panel sequentially
+         panels = []
+         errors = []
+
+         for i, description in enumerate(panel_prompts, 1):
+             try:
+                 panel = await self._generate_panel_content(
+                     panel_id=i,
+                     description=description,
+                     enhanced_prompt=prompt,  # Using Brown's enhanced prompt directly
+                     style_tags=style_tags,
+                     language="english",  # Default for now
+                     extras=[],  # Simplified for now
+                     session_id=session_id,
+                 )
+                 panels.append(panel)
+                 self._update_generation_progress(i, panel_count)
+
+             except Exception as e:
+                 error_msg = f"Failed to generate panel {i}: {str(e)}"
+                 logger.error(error_msg)
+                 errors.append(error_msg)
+                 panels.append(
+                     PanelContent(
+                         panel_id=i, description=description, errors=[error_msg]
+                     )
+                 )
+
+         # Calculate total time
+         total_time = time.time() - start_time
+
+         # Create metadata
+         metadata = self._create_generation_metadata(
+             payload, panels, total_time, errors
+         )
+
+         # Update session metadata
+         self.update_metadata(metadata)
+         self._save_current_state(message, panels, metadata)
+
+         # Create result object
+         result = GenerationResult(
+             session_id=session_id,
+             panels=panels,
+             metadata=metadata,
+             status=(
+                 GenerationStatus.COMPLETED
+                 if not errors
+                 else GenerationStatus.FAILED
+             ),
+             total_time=total_time,
+             errors=errors,
+         )
+
+         # Create response message
+         if self.memory:
+             self.memory.add_message(
+                 "assistant",
+                 f"Generated {len(panels)} panels in {total_time:.2f}s",
+             )
+
+         if self.message_factory:
+             return self.message_factory.create_approval_message(
+                 result.to_dict(),
+                 {
+                     "overall_score": 1.0 if not errors else 0.7,
+                     "generation_successful": True,
+                     "panels_generated": len(panels),
+                     "total_time": total_time,
+                 },
+                 1,  # Initial iteration
+             )
+
+         # Create a plain message if no factory available
+         plain_message = AgentMessage(
+             message_id=f"msg_{uuid.uuid4().hex[:8]}",
+             timestamp=datetime.utcnow().isoformat() + "Z",
+             sender="agent_bayko",
+             recipient="agent_brown",
+             message_type="generation_response",
+             payload=result.to_dict(),
+             context=context,
+         )
+
+         return plain_message
+
+     async def process_refinement_request(
+         self, message: Dict[str, Any]
+     ) -> AgentMessage:
+         """Process refinement request from Agent Brown"""
+
+         start_time = time.time()
+
+         # Extract request data
+         payload = message.get("payload", {})
+         context = message.get("context", {})
+         session_id = context.get("session_id")
+         iteration = context.get("iteration", 1)
+
+         if not session_id:
+             raise ValueError("No session_id provided in message context")
+
+         # Initialize session if needed
+         if self.current_session != session_id:
+             self._initialize_session(
+                 session_id, context.get("conversation_id")
+             )
+
+         # Extract refinement data
+         original_content = payload.get("original_content", {})
+         feedback = payload.get("feedback", {})
+         focus_areas = payload.get("focus_areas", [])
+         refinements = payload.get("specific_improvements", [])
+
+         # Log refinement request
+         if self.memory:
+             self.memory.add_message(
+                 "user",
+                 f"Refinement request received - Areas: {', '.join(focus_areas)}",
+             )
+
+         # Generate new panels for ones needing refinement
+         panels = []
+         errors = []
+
+         # Convert original panels to PanelContent objects
+         original_panels = [
+             PanelContent(
+                 panel_id=p.get("panel_id"),
+                 description=p.get("description", ""),
+                 enhanced_prompt=p.get("enhanced_prompt", ""),
+                 image_path=p.get("image_path"),
+                 image_url=p.get("image_url"),
+                 style_tags=p.get("style_tags", []),
+                 status=p.get("status", "pending"),
+                 generation_time=p.get("generation_time", 0.0),
+                 errors=p.get("errors", []),
+             )
+             for p in original_content.get("panels", [])
+         ]
+
+         # Process each panel
+         for panel in original_panels:
+             try:
+                 if any(
+                     area in focus_areas
+                     for area in ["visual_quality", "style_consistency"]
+                 ):
+                     # Step 1: Improve prompt based on feedback
+                     improved_prompt = self._improve_prompt_with_feedback(
+                         original_prompt=panel.enhanced_prompt,
+                         feedback=feedback,
+                         improvements=refinements,
+                     )
+
+                     # Step 2: Generate new panel with improved prompt
+                     refined_panel = await self._generate_panel_content(
+                         panel_id=panel.panel_id,
+                         description=panel.description,
+                         enhanced_prompt=improved_prompt,  # Use improved prompt
+                         style_tags=panel.style_tags,
+                         language="english",
+                         extras=[],
+                         session_id=session_id,
+                     )
+
+                     # Add refinement history
+                     refined_panel.refinement_history.append(
+                         {
+                             "iteration": iteration,
+                             "feedback": feedback,
+                             "improvements": refinements,
+                             "original_prompt": panel.enhanced_prompt,
+                             "refined_prompt": improved_prompt,
+                         }
+                     )
+                     panels.append(refined_panel)
+                 else:
+                     # Keep original panel
+                     panels.append(panel)
+
+             except Exception as e:
+                 error_msg = (
+                     f"Failed to refine panel {panel.panel_id}: {str(e)}"
+                 )
+                 logger.error(error_msg)
+                 errors.append(error_msg)
+                 panels.append(panel)  # Keep original on error
+
+         total_time = time.time() - start_time
+
+         # Create metadata
+         metadata = {
+             "refinement": {
+                 "iteration": iteration,
+                 "feedback": feedback,
+                 "improvements": refinements,
+                 "focus_areas": focus_areas,
+                 "panels_refined": len(
+                     [p for p in panels if len(p.refinement_history) > 0]
+                 ),
+                 "total_time": total_time,
+             },
+             "timestamp": datetime.utcnow().isoformat() + "Z",
+         }
+
+         # Save state
+         self._save_current_state(message, panels, metadata)
+
+         # Create result
+         result = GenerationResult(
+             session_id=session_id,
+             panels=panels,
+             metadata=metadata,
+             status=(
+                 GenerationStatus.COMPLETED
+                 if not errors
+                 else GenerationStatus.FAILED
+             ),
+             total_time=total_time,
+             errors=errors,
+             refinement_applied=True,
+         )
+
+         # Log completion
+         if self.memory:
+             self.memory.add_message(
+                 "assistant",
+                 f"Refined {len([p for p in panels if len(p.refinement_history) > 0])} panels in {total_time:.2f}s",
+             )
+
+         # Create response message
+         if self.message_factory:
+             return self.message_factory.create_approval_message(
+                 result.to_dict(),
+                 {
+                     "overall_score": 1.0 if not errors else 0.7,
+                     "refinement_successful": True,
+                     "panels_refined": len(
+                         [p for p in panels if len(p.refinement_history) > 0]
+                     ),
+                     "total_time": total_time,
+                     "iteration": iteration,
+                 },
+                 iteration,
+             )
+
+         # Fallback to direct response if no message factory
+         return AgentMessage(
+             message_id=f"msg_{uuid.uuid4().hex[:8]}",
+             timestamp=datetime.utcnow().isoformat() + "Z",
+             sender="agent_bayko",
+             recipient="agent_brown",
+             message_type="refinement_response",
+             payload=result.to_dict(),
+             context=context,
+         )
+
+     def _create_panel_descriptions(
+         self, prompt: str, panel_count: int, style_config: Dict[str, Any]
+     ) -> List[str]:
+         """
+         Use mini LLM to break down the story prompt into panel descriptions
+
+         Args:
+             prompt: The main story prompt
+             panel_count: Number of panels to generate
+             style_config: Style configuration for the panels
+
+         Returns:
+             List of panel descriptions
+         """
+         if not self.llm:
+             # Fallback to basic panel descriptions if LLM is not initialized
+             logger.warning("LLM not available, using basic panel descriptions")
+             return [
+                 f"{prompt} (Panel {i+1} of {panel_count})"
+                 for i in range(panel_count)
+             ]
+
+         try:
+             system_prompt = f"""You are a comic storyboarding expert. Break down this story into {panel_count} engaging and visually interesting sequential panels.
+ For each panel:
+ - Focus on key narrative moments
+ - Include visual composition guidance
+ - Maintain continuity between panels
+ - Consider dramatic timing and pacing
+ DO NOT include panel numbers or "Panel X:" prefixes.
+ Return ONLY the panel descriptions, one per line.
+ Style notes: {json.dumps(style_config, indent=2)}"""
+
+             client = OpenAI()
+             response = client.chat.completions.create(
+                 model="gpt-3.5-turbo",
+                 messages=[
+                     {"role": "system", "content": system_prompt},
+                     {"role": "user", "content": prompt},
+                 ],
+                 temperature=0.7,
+                 max_tokens=1000,
+             )
+             # Extract panel descriptions
+             descriptions = []
+             if response.choices[0].message.content:
+                 # Split on newlines and filter out empty lines
+                 descriptions = [
+                     d.strip()
+                     for d in response.choices[0].message.content.split("\n")
+                     if d.strip()
+                 ]
+
+             # Ensure we have exactly panel_count descriptions
+             if len(descriptions) < panel_count:
+                 # Pad with basic descriptions if needed
+                 descriptions.extend(
+                     [
+                         f"{prompt} (Panel {i+1})"
+                         for i in range(len(descriptions), panel_count)
+                     ]
+                 )
+             elif len(descriptions) > panel_count:
+                 # Trim extra descriptions
+                 descriptions = descriptions[:panel_count]
+
+             return descriptions
+
+         except Exception as e:
+             logger.error(f"Failed to create panel descriptions: {e}")
+             # Fallback to basic descriptions
+             return [
+                 f"{prompt} (Panel {i+1} of {panel_count})"
+                 for i in range(panel_count)
+             ]
+
+     def generate_prompt_from_description(
+         self,
+         description: str,
+         style_tags: List[str],
+         mood: Optional[str] = None,
+     ) -> str:
+         """
+         Use mini LLM to convert panel descriptions into SDXL-optimized prompts
+
+         Args:
+             description: The panel description to optimize
+             style_tags: List of style tags to incorporate
+
+         Returns:
+             SDXL-optimized prompt
+         """
+         if not self.llm:
+             # Fallback to basic prompt if LLM is not initialized
+             style_str = ", ".join(style_tags) if style_tags else ""
+             return f"{description} {style_str}".strip()
+
+         try:
+             system_prompt = """You are an expert at crafting prompts for SDXL image generation.
+ Convert the given panel description into an optimized SDXL prompt that will generate a high-quality comic panel.
+ Follow these guidelines:
+ - Be specific about visual elements and composition
+ - Maintain artistic consistency with the provided style
+ - Use clear, direct language that SDXL will understand
+ - Focus on key details that drive the narrative
+ Return ONLY the optimized prompt text with no additional formatting."""
+
+             style_context = ""
+             if style_tags:
+                 style_context = (
+                     f"\nStyle requirements: {', '.join(style_tags)}"
+                 )
+
+             client = OpenAI()
+             response = client.chat.completions.create(
+                 model="gpt-3.5-turbo",
+                 messages=[
+                     {"role": "system", "content": system_prompt},
+                     {
+                         "role": "user",
+                         "content": f"{description}{style_context}",
+                     },
+                 ],
+                 temperature=0.5,
+                 max_tokens=500,
+             )
+
+             if response.choices[0].message.content:
+                 return response.choices[0].message.content.strip()
+
+             # Fallback if no valid response
+             style_str = ", ".join(style_tags) if style_tags else ""
+             return f"{description} {style_str}".strip()
+
+         except Exception as e:
+             logger.error(f"Failed to optimize prompt: {e}")
+             # Fallback to basic prompt
+             style_str = ", ".join(style_tags) if style_tags else ""
+             return f"{description} {style_str}".strip()
+
+     def _update_generation_progress(
+         self, current_panel: int, total_panels: int
+     ) -> None:
+         """Update generation progress tracking"""
+         progress = (current_panel / total_panels) * 100
+         logger.info(
+             f"Generation Progress: {progress:.1f}% ({current_panel}/{total_panels})"
+         )
+
+         self.generation_stats["panels_generated"] += 1
+
+     async def _generate_panel_content(
+         self,
+         panel_id: int,
+         description: str,
+         enhanced_prompt: str,
+         style_tags: List[str],
+         language: str = "english",
+         extras: Optional[List[str]] = None,
+         session_id: Optional[str] = None,
+     ) -> PanelContent:
+         """
+         Generate content for a single panel
+
+         Args:
+             panel_id: Panel identifier
+             description: Raw panel description
+             enhanced_prompt: Initial enhanced prompt
+             style_tags: Style tags to apply
+             language: Language for text (default: english)
+             extras: Additional generation parameters
+             session_id: Current session ID
+
+         Returns:
+             PanelContent object with generated content
+         """
+         start_time = time.time()
+         extras = extras or []
+
+         try:
+             # Step 1: Optimize prompt for SDXL using LLM
+             optimized_prompt = self.generate_prompt_from_description(
+                 description=description, style_tags=style_tags
+             )
+             # Step 2: Generate image using optimized prompt
+             generated_path, gen_time = (
+                 await self.image_generator.generate_panel_image(
+                     prompt=optimized_prompt,
+                     style_tags=style_tags,
+                     panel_id=panel_id,
+                     session_id=session_id if session_id is not None else "",
+                 )
+             )
+
+             # Step 3: Create panel content object
+             panel = PanelContent(
+                 panel_id=panel_id,
+                 description=description,
+                 enhanced_prompt=optimized_prompt,  # Store the optimized prompt
+                 image_path=generated_path,
+                 image_url=(
+                     f"file://{generated_path}" if generated_path else None
+                 ),
+                 style_tags=style_tags,
+                 status=GenerationStatus.COMPLETED.value,
+                 generation_time=gen_time,
+             )
+
+             return panel
+
+         except Exception as e:
+             error_msg = f"Panel {panel_id} generation failed: {str(e)}"
+             logger.error(error_msg)
+
+             return PanelContent(
+                 panel_id=panel_id,
+                 description=description,
+                 enhanced_prompt=enhanced_prompt,
+                 style_tags=style_tags,
+                 status=GenerationStatus.FAILED.value,
+                 generation_time=time.time() - start_time,
+                 errors=[error_msg],
+             )
+
+     def _create_generation_metadata(
+         self,
+         payload: Dict[str, Any],
+         panels: List[PanelContent],
+         total_time: float,
+         errors: List[str],
+     ) -> Dict[str, Any]:
+         """Create metadata for generation request"""
+         return {
+             "request": {
+                 "prompt": payload.get("prompt", ""),
+                 "style_tags": payload.get("style_tags", []),
+                 "panels": len(panels),
+             },
+             "generation": {
+                 "total_time": total_time,
+                 "panels_completed": len(
+                     [p for p in panels if p.status == "completed"]
+                 ),
+                 "panels_failed": len(
+                     [p for p in panels if p.status == "failed"]
+                 ),
+                 "errors": errors,
+             },
+             "timestamp": datetime.utcnow().isoformat() + "Z",
+         }
+
+     def _save_current_state(
+         self,
+         message: Dict[str, Any],
+         panels: List[PanelContent],
+         metadata: Dict[str, Any],
+     ) -> None:
+         """Save current state to session storage"""
+         if not self.session_manager:
+             return
+
+         state_data = {
+             "current_message": message,
+             "panels": [panel.to_dict() for panel in panels],
+             "metadata": metadata,
+             "timestamp": datetime.utcnow().isoformat() + "Z",
+         }
+
+         state_file = Path(
+             f"storyboard/{self.current_session}/agents/bayko_state.json"
+         )
+         state_file.parent.mkdir(parents=True, exist_ok=True)
+
+         with open(state_file, "w") as f:
+             json.dump(state_data, f, indent=2)
+
+     def get_generation_stats(self) -> Dict[str, Any]:
+         """Get current generation statistics"""
+         return self.generation_stats.copy()
+
+     def _initialize_session(
+         self, session_id: str, conversation_id: Optional[str] = None
+     ):
+         """Initialize session with unified memory system (matches Brown's pattern)"""
+         self.current_session = session_id
+         conversation_id = conversation_id or f"conv_{session_id}"
+
+         # Initialize session-specific services (matches Brown's pattern)
+         self.session_manager = ServiceSessionManager(
+             session_id, conversation_id
+         )
+         self.message_factory = MessageFactory(session_id, conversation_id)
+         self.memory = AgentMemory(session_id, "bayko")
+
+         # Create Bayko's directory structure
+         self._create_session_directories(session_id)
+
+         print(f"🧠 Bayko initialized unified memory for session {session_id}")
+         logger.info(f"Bayko session initialized: {session_id}")
+
+     def _create_session_directories(self, session_id: str):
+         """Create Bayko's session directory structure"""
+         session_dir = Path(f"storyboard/{session_id}")
+         content_dir = session_dir / "content"
+         agents_dir = session_dir / "agents"
+         iterations_dir = session_dir / "iterations"
+         output_dir = session_dir / "output"
+
+         # Create directory structure
+         for dir_path in [content_dir, agents_dir, iterations_dir, output_dir]:
+             dir_path.mkdir(parents=True, exist_ok=True)
+
+     def update_metadata(self, metadata: Dict[str, Any]):
+         """Update content metadata"""
+         if not self.current_session:
+             return
+
+         session_dir = Path(f"storyboard/{self.current_session}")
+         content_dir = session_dir / "content"
+         content_dir.mkdir(parents=True, exist_ok=True)
+         metadata_file = content_dir / "metadata.json"
+
+         # Load existing metadata if it exists
+         existing_metadata = {}
+         if metadata_file.exists():
+             with open(metadata_file, "r") as f:
+                 existing_metadata = json.load(f)
+
+         # Merge with new metadata
+         existing_metadata.update(metadata)
+         existing_metadata["updated_at"] = datetime.utcnow().isoformat() + "Z"
+
+         with open(metadata_file, "w") as f:
+             json.dump(existing_metadata, f, indent=2)
+
+     def save_iteration_data(self, iteration: int, data: Dict[str, Any]):
+         """Save iteration-specific data"""
+         if not self.current_session:
+             return
+
+         session_dir = Path(f"storyboard/{self.current_session}")
+         iterations_dir = session_dir / "iterations"
+         iterations_dir.mkdir(parents=True, exist_ok=True)
+         iteration_file = iterations_dir / f"v{iteration}_generation.json"
+
+         with open(iteration_file, "w") as f:
+             json.dump(data, f, indent=2)
+
+     def save_bayko_state(self, state_data: Dict[str, Any]):
+         """Save Bayko's current state"""
+         if not self.current_session:
+             return
+
+         session_dir = Path(f"storyboard/{self.current_session}")
+         agents_dir = session_dir / "agents"
+         agents_dir.mkdir(parents=True, exist_ok=True)
+         state_file = agents_dir / "bayko_state.json"
+
+         with open(state_file, "w") as f:
+             json.dump(state_data, f, indent=2)
+
+     def get_session_info(self) -> Dict[str, Any]:
+         """Get current session information (matches Brown's interface)"""
+         memory_size = 0
+         if self.memory:
+             try:
+                 memory_size = self.memory.get_memory_size()
+             except Exception:
+                 memory_size = 0
+
+         return {
+             "session_id": self.current_session,
+             "memory_size": memory_size,
+             "generation_stats": self.generation_stats,
+         }
+
+     def _improve_prompt_with_feedback(
+         self,
+         original_prompt: str,
+         feedback: Dict[str, Any],
+         improvements: List[str],
+     ) -> str:
+         """
+         Use mini LLM to improve a prompt based on feedback
+
+         Args:
+             original_prompt: The original prompt to improve
+             feedback: Dictionary containing feedback data
+             improvements: List of specific improvements to make
+
+         Returns:
+             Improved prompt
+         """
+         if not self.llm:
+             # Fallback to original prompt if LLM is not initialized
+             logger.warning("LLM not available, using original prompt")
+             return original_prompt
+
+         try:
+             # Construct feedback context
+             feedback_str = ""
+             if feedback:
+                 feedback_str = "Feedback:\n"
+                 for key, value in feedback.items():
+                     feedback_str += f"- {key}: {value}\n"
+
+             improvements_str = ""
+             if improvements:
+                 improvements_str = "Requested improvements:\n"
+                 for imp in improvements:
+                     improvements_str += f"- {imp}\n"
+
+             system_prompt = """You are an expert at refining SDXL image generation prompts.
+ Analyze the feedback and improve the original prompt while maintaining its core narrative elements.
+ Focus on addressing the specific feedback points and requested improvements.
+ Return ONLY the improved prompt with no additional formatting or explanations."""
+
+             client = OpenAI()
+             response = client.chat.completions.create(
+                 model="gpt-3.5-turbo",
+                 messages=[
+                     {"role": "system", "content": system_prompt},
+                     {
+                         "role": "user",
+                         "content": f"Original prompt: {original_prompt}\n\n{feedback_str}\n{improvements_str}",
+                     },
+                 ],
+                 temperature=0.5,
+                 max_tokens=500,
+             )
+
+             if response.choices[0].message.content:
+                 return response.choices[0].message.content.strip()
+
+             # Fallback if no valid response
+             return original_prompt
+
+         except Exception as e:
+             logger.error(f"Failed to improve prompt: {e}")
+             # Fallback to original prompt
+             return original_prompt
+
+
+ # Example usage and testing
+ async def main():
+     """Example usage of Agent Bayko leveraging the BaykoWorkflow"""
+     from agents.bayko_workflow import create_agent_bayko
+     import os
+
+     # Create Bayko agent using the factory function
+     bayko_workflow = create_agent_bayko(os.getenv("OPENAI_API_KEY"))
+
+     # Initialize a test session
+     from services.session_id_generator import SessionIdGenerator
+
+     session_id = SessionIdGenerator.create_session_id("test")
+     conversation_id = f"conv_{session_id}"
+     bayko_workflow.initialize_session(session_id, conversation_id)
+
+     test_message = {
+         "message_id": "msg_12345",
+         "timestamp": "2025-01-15T10:30:00Z",
+         "sender": "agent_brown",
+         "recipient": "agent_bayko",
+         "message_type": "generation_request",
+         "payload": {
+             "prompt": "A moody K-pop idol finds a puppy on the street. It changes everything. Visual style: whimsical, nature, soft_lighting, watercolor, mood: peaceful, color palette: warm_earth_tones",
+             "original_prompt": "A moody K-pop idol finds a puppy on the street. It changes everything.",
+             "style_tags": [
+                 "whimsical",
+                 "nature",
+                 "soft_lighting",
+                 "watercolor",
+             ],
+             "panels": 4,
+             "language": "korean",
+             "extras": ["narration", "subtitles"],
+             "style_config": {
+                 "primary_style": "studio_ghibli",
+                 "mood": "peaceful",
+                 "color_palette": "warm_earth_tones",
+                 "confidence": 0.9,
+             },
+             "generation_params": {
+                 "quality": "high",
+                 "aspect_ratio": "16:9",
+                 "panel_layout": "sequential",
+             },
+         },
+         "context": {
+             "conversation_id": conversation_id,
+             "session_id": session_id,
+             "iteration": 1,
+             "previous_feedback": None,
+             "validation_score": 0.85,
+         },
+     }
+
+     # Process generation request
+     print("🎨 Processing generation request...")
+     result = bayko_workflow.process_generation_request(test_message)
+
+     # Print generation results
+     print("\n✨ Generation completed!")
+     print("-" * 40)
+     print(f"Result type: {type(result)}")
+
+     if isinstance(result, str):
+         # For workflow result
+         print(f"Generation Result: {result}")
+     else:
+         # For direct Bayko result
+         result_data = result.payload
+         print(f"\nPanels Generated: {len(result_data['panels'])}")
+         print(
+             f"Total Time: {result_data['metadata']['generation']['total_time']:.2f}s"
+         )
+         print(
+             f"Success Rate: {result_data['metadata']['generation']['panels_completed']}/{len(result_data['panels'])}"
+         )
+
+         if isinstance(result_data, dict) and result_data.get(
+             "metadata", {}
+         ).get("generation", {}).get("errors"):
+             print("\n⚠️ Errors:")
+             errors = (
+                 result_data.get("metadata", {})
+                 .get("generation", {})
+                 .get("errors")
+             )
+             if not isinstance(errors, list):
+                 errors = []
+             for error in errors:
+                 print(f"- {error}")
+
+     print("\n✅ Test operations completed successfully!")
+
+
+ if __name__ == "__main__":
+     import asyncio
+
+     asyncio.run(main())
agents/bayko_tools.py ADDED
@@ -0,0 +1,219 @@
+ import json
+ import base64
+
+ # import uuid  # Currently unused
+ import asyncio
+ from datetime import datetime
+ from typing import Dict, List, Optional, Any, Tuple
+ from pathlib import Path
+ from dataclasses import dataclass, asdict
+ from enum import Enum
+ import logging
+ import time
+ import os
+ import random
+ import modal
+
+ # Core services - Updated to match Brown's memory system
+ from services.unified_memory import AgentMemory
+ from services.session_manager import SessionManager as ServiceSessionManager
+ from services.message_factory import MessageFactory, AgentMessage, MessageType
+ from services.session_id_generator import SessionIdGenerator
+
+ # Configure logging
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+
+
+ class ModalImageGenerator:
+     """
+     MCP client implementation for SDXL Turbo image generation via Modal compute
+     """
+
+     def __init__(self):
+         self.model_loaded = False
+         logger.info("ModalImageGenerator initialized")
+         # Import the Modal app here to ensure it's only loaded when needed
+         try:
+             from tools.image_generator import generate_comic_panel, app
+
+             self.generate_panel = generate_comic_panel
+             self.app = app
+             self.model_loaded = True
+         except ImportError as e:
+             logger.error(f"Failed to import Modal image generator: {e}")
+
+     async def generate_panel_image(
+         self,
+         prompt: str,
+         style_tags: List[str],  # Kept for backward compatibility but not used
+         panel_id: int,
+         session_id: str,
+     ) -> Tuple[str, float]:
+         """
+         Generate comic panel image using SDXL via Modal MCP
+         Returns tuple of (image_path, generation_time)
+         """
+         if not self.model_loaded:
+             raise RuntimeError(
+                 "Modal image generator not properly initialized"
+             )
+
+         start_time = time.time()
+         try:
+             # Call the Modal function directly
+             with self.app.run():
+                 img_bytes, duration = await self.generate_panel.remote.aio(
+                     prompt=prompt,
+                     panel_id=panel_id,
+                     session_id=session_id,
+                     steps=1,  # Using SDXL Turbo default
+                     seed=42,
+                 )
+
+             # Create output path and save the image
+             content_dir = Path(f"storyboard/{session_id}/content")
+             content_dir.mkdir(parents=True, exist_ok=True)
+             image_path = content_dir / f"panel_{panel_id}.png"
+
+             # Save the returned image bytes
+             with open(image_path, "wb") as f:
+                 f.write(img_bytes)
+
+             generation_time = time.time() - start_time
+             logger.info(
+                 f"Generated image for panel {panel_id} in {generation_time:.2f}s"
+             )
+             return str(image_path), generation_time
+
+         except Exception as e:
+             logger.error(f"Failed to generate image: {e}")
+             raise
+
+
+ class ModalCodeExecutor:
+     """
+     Modal code execution sandbox using fries.py for Python script generation and execution
+     """
+
+     def __init__(self):
+         self.app = None
+         self.generate_and_run = None
+         logger.info("ModalCodeExecutor initializing")
+         try:
+             from tools.fries import app, generate_and_run_script
+
+             self.app = app
+             # Store the Modal function directly like in ModalImageGenerator
+             self.generate_and_run = generate_and_run_script
+             logger.info("Successfully loaded Modal fries app")
+         except ImportError as e:
+             logger.error(f"Failed to import Modal fries app: {e}")
+
+     async def execute_code(
+         self, prompt: str, session_id: str
+     ) -> Tuple[str, float]:
+         """
+         Execute code in Modal sandbox for interactive comic elements using fries.py
+
+         Args:
+             prompt: The prompt to generate and run code for
+             session_id: Session identifier for file organization
+
+         Returns:
+             Tuple of (script_file_path, execution_time)
+         """
+         if not self.app or not self.generate_and_run:
+             raise RuntimeError("Modal fries app not properly initialized")
+
+         start_time = time.time()
+
+         try:
+             # Generate animal art with fries
+             animal = random.choice(
+                 [
+                     "cat",
+                     "dog",
+                     "fish",
+                     "bird",
+                     "giraffe",
+                     "turtle",
+                     "monkey",
+                     "rabbit",
+                     "puppy",
+                     "animal",
+                 ]
+             )
+
+             print(f"\n🤖 Generating an ascii {animal} for you!")
+
+             # Execute code via Modal with the EXACT same prompt structure as main()
+             with self.app.run():
+                 result = await self.generate_and_run.remote.aio(
+                     f"""
+                     create a simple ASCII art of a {animal}.
+                     Create ASCII art using these characters: _ - = ~ ^ \\\\ / ( ) [ ] {{ }} < > | . o O @ *
+                     Draw the art line by line with print statements.
+                     Write a short, funny Python script.
+                     Use only basic Python features.
+                     Add a joke or pun about fries in the script.
+                     Make it light-hearted and fun.
+                     End with a message about fries.
+                     Make sure the script runs without errors.
+                     """,
+                     session_id,
+                 )
+
+             print("=" * 30)
+             print("\n🎮 Code Output:")
+             print("=" * 30)
+             print("\n\n")
+             print(result["output"])
+
+             print("🍟 🍟 🍟")
+             print("Golden crispy Python fries")
+             print("Coming right up!")
+             print()
+             print("Haha. Just kidding.")
+
+             script_file = f"storyboard/{session_id}/output/fries_for_you.py"
+             os.makedirs(os.path.dirname(script_file), exist_ok=True)
+
+             if result["code"]:
+                 # Save the generated code locally
+                 with open(script_file, "w") as f:
+                     f.write(result["code"])
+                 print("\nGo here to check out your actual custom code:")
+                 print(f"👉 Code saved to: {script_file}")
+                 print("\n\n\n")
+
+             if result["error"]:
+                 print("\n❌ Error:")
+                 print("=" * 40)
+                 print(result["error"])
+                 print("Looks like there was an error during execution.")
+                 print("Here are some extra fries to cheer you up!")
+                 print("🍟 🍟 🍟")
+                 print(" 🍟 🍟 ")
+                 print(" 🍟 ")
+                 print("Now with extra machine-learned crispiness.")
+
+             execution_time = time.time() - start_time
+             return script_file, execution_time
+
+         except modal.exception.FunctionTimeoutError:
+             print(
+                 "⏰ Script execution timed out after 300 seconds and 3 tries!"
+             )
+             print("Sorry but codestral is having a hard time drawing today.")
+             print("Here's a timeout fry for you! 🍟")
+             print("Here are some extra fries to cheer you up!")
+             print("🍟 🍟 🍟")
+             print(" 🍟 🍟 ")
+             print(" 🍟 ")
+             print("Now with extra machine-learned crispiness.")
+             raise
+         except Exception as e:
+             logger.error(f"Failed to execute code: {e}")
+             raise
agents/bayko_workflow.py ADDED
@@ -0,0 +1,328 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Agent Bayko LlamaIndex ReAct Workflow
3
+ Hackathon demo showcasing content generation with LLM-enhanced prompts and visible reasoning
4
+ """
5
+
6
+ import os
7
+ import json
8
+ import asyncio
9
+ from typing import Optional, Dict, Any, List
10
+ from openai import OpenAI as OpenAIClient
11
+ from llama_index.llms.openai import OpenAI as LlamaOpenAI
12
+ from llama_index.core.agent import ReActAgent
13
+ from llama_index.core.memory import ChatMemoryBuffer
14
+ from llama_index.core.tools import FunctionTool, BaseTool
15
+
16
+ # Standard library imports
17
+ from pathlib import Path
18
+ from datetime import datetime
19
+
20
+ # Core services
21
+ from services.unified_memory import AgentMemory
22
+ from services.session_id_generator import SessionIdGenerator
23
+ from services.session_manager import SessionManager
24
+ from services.message_factory import MessageFactory, AgentMessage
25
+ from agents.bayko import AgentBayko
26
+ from agents.bayko_tools import (
27
+ ModalImageGenerator,
28
+ ModalCodeExecutor,
29
+ )
30
+ from agents.bayko_workflow_tools import BaykoWorkflowTools
31
+
32
+ # Custom prompts
33
+ from prompts.bayko_workflow_system_prompt import BAYKO_WORKFLOW_SYSTEM_PROMPT
34
+
35
+ # Load environment variables
36
+ try:
37
+ from dotenv import load_dotenv
38
+
39
+ load_dotenv()
40
+ except ImportError:
41
+ pass
42
+
43
+ from datetime import datetime
44
+
45
+ openai_api_key = os.getenv("OPENAI_API_KEY")
46
+
47
+
48
+ """
49
+ Handle LLM interactions for prompt enhancement and generation flow
50
+ """
51
+
52
+
53
+ class BaykoWorkflow:
54
+ """
55
+ Agent Bayko Workflow using LlamaIndex ReActAgent
56
+ Showcases LLM-enhanced content generation with visible reasoning
57
+ """
58
+
59
+ def __init__(self, openai_api_key: Optional[str] = None):
60
+ # Initialize LLM if available and not explicitly disabled
61
+ self.llm = None
62
+
63
+ # Only initialize if api_key is explicitly provided (not None)
64
+ if openai_api_key is not None:
65
+ try:
66
+ self.llm = LlamaOpenAI(
67
+ model="gpt-4.1-mini",
68
+ api_key=openai_api_key, # Don't use env var
69
+ temperature=0.7,
70
+ max_tokens=2048,
71
+ )
72
+ print("✓ Initialized LlamaIndex LLM for enhanced prompts")
73
+ except Exception as e:
74
+ print(f"⚠️ Could not initialize LlamaIndex LLM: {e}")
75
+ self.llm = None
76
+
77
+ # Initialize core Bayko agent with matching LLM state
78
+ self.bayko_agent = AgentBayko()
79
+ if self.llm:
80
+ self.bayko_agent.llm = self.llm
81
+
82
+ # Initialize session services as None
83
+ self.session_manager = None
84
+ self.memory = None
85
+ self.message_factory = None
86
+
87
+ # Create Bayko tools
88
+ self.bayko_tools = BaykoWorkflowTools(self.bayko_agent)
89
+ self.tools = self.bayko_tools.create_llamaindex_tools()
90
+
91
+ # System prompt for ReAct agent
92
+ self.system_prompt = BAYKO_WORKFLOW_SYSTEM_PROMPT
93
+
94
+ # Initialize ReAct agent if we have LLM
95
+ self.agent = None
96
+ if self.llm:
97
+ from typing import cast
98
+
99
+ tools_list = cast(List[BaseTool], self.tools)
100
+ self.agent = ReActAgent.from_tools(
101
+ tools=tools_list,
102
+ llm=self.llm,
103
+ memory=ChatMemoryBuffer.from_defaults(token_limit=4000),
104
+ system_prompt=self.system_prompt,
105
+ verbose=True,
106
+ max_iterations=15,
107
+ )
108
+
109
+ def initialize_workflow(self):
110
+ """
111
+ Initialize core workflow components that don't require session context.
112
+ Session-specific services should be initialized via initialize_session().
113
+ """
114
+ # Only initialize the LLM-powered components if not already done
115
+ if self.llm and not self.agent:
116
+ # Initialize ReActAgent with LLM-powered tools
117
+ from typing import cast
118
+
119
+ tools_list = cast(List[BaseTool], self.tools)
120
+ self.agent = ReActAgent.from_tools(
121
+ tools=tools_list,
122
+ llm=self.llm,
123
+ memory=ChatMemoryBuffer.from_defaults(token_limit=4000),
124
+ system_prompt=self.system_prompt,
125
+ verbose=True,
126
+ max_iterations=15,
127
+ )
128
+ print("✓ Initialized ReActAgent with tools")
129
+
130
+ def initialize_session(
131
+ self, session_id: str, conversation_id: Optional[str] = None
132
+ ):
133
+ """
134
+ Initialize or reinitialize session-specific services for Bayko workflow.
135
+ This should be called before processing any requests to ensure proper session context.
136
+
137
+ Args:
138
+ session_id: Unique identifier for this generation session
139
+ conversation_id: Optional conversation ID. If not provided, will be derived from session_id
140
+ """
141
+ conversation_id = conversation_id or f"conv_{session_id}"
142
+
143
+ # Initialize or reinitialize session services
144
+ self.session_manager = SessionManager(session_id, conversation_id)
145
+ self.memory = AgentMemory(session_id, "bayko")
146
+ self.message_factory = MessageFactory(session_id, conversation_id)
147
+
148
+ # Update Bayko agent with session services
149
+ self.bayko_agent._initialize_session(session_id, conversation_id)
150
+
151
+ # If we have an LLM agent, ensure its memory is aware of the session
152
+ if self.agent and hasattr(self.agent, "memory"):
153
+ from llama_index.core.llms import ChatMessage, MessageRole
154
+
155
+ self.agent.memory = ChatMemoryBuffer.from_defaults(
156
+ token_limit=4000,
157
+ chat_history=[
158
+ ChatMessage(
159
+ role=MessageRole.SYSTEM,
160
+ content=f"Session ID: {session_id}",
161
+ )
162
+ ],
163
+ )
164
+
165
+ print(f"🧠 Bayko workflow initialized for session {session_id}")
166
+
167
+ def process_generation_request(self, request_data: Dict[str, Any]) -> str:
168
+ """
169
+ Process content generation request using ReActAgent
170
+
171
+ Args:
172
+ request_data: Structured request from Agent Brown
173
+
174
+ Returns:
175
+ Agent's response with generated content
176
+ """
177
+ if not self.agent:
178
+ return self._fallback_generation(request_data)
179
+
180
+ try:
181
+ print("🤖 Agent Bayko ReAct Workflow")
182
+ print("=" * 60)
183
+ print(f"📝 Request: {request_data.get('prompt', 'N/A')[:100]}...")
184
+ print("\n🔄 Bayko Processing...")
185
+ print("=" * 60)
186
+
187
+ # Format request for agent
188
+ request_prompt = f"""Generate comic content for the following request:
189
+
190
+ Original Prompt: {request_data.get('original_prompt', '')}
191
+ Enhanced Prompt: {request_data.get('prompt', '')}
192
+ Style Tags: {request_data.get('style_tags', [])}
193
+ Panels: {request_data.get('panels', 4)}
194
+ Session ID: {request_data.get('session_id', 'default')}
195
+
196
+ Please generate enhanced prompts and create the comic content."""
197
+
198
+ # Process with ReActAgent (shows Thought/Action/Observation)
199
+ response = self.agent.chat(request_prompt)
200
+
201
+ print("\n" + "=" * 60)
202
+ print("🎉 Agent Bayko Response:")
203
+ print("=" * 60)
204
+
205
+ return str(response)
206
+
207
+ except Exception as e:
208
+ error_msg = f"❌ Error in Bayko generation: {str(e)}"
209
+ print(error_msg)
210
+ return self._fallback_generation(request_data)
211
+
212
+ def _fallback_generation(self, request_data: Dict[str, Any]) -> str:
213
+ """Fallback generation when ReActAgent is unavailable"""
214
+ print("⚠️ Using fallback generation (no LLM agent)")
215
+
216
+ # Use core Bayko methods directly
217
+ panels = request_data.get("panels", 4)
218
+ style_tags = request_data.get("style_tags", [])
219
+
220
+ result = {
221
+ "status": "completed",
222
+ "method": "fallback",
223
+ "panels_generated": panels,
224
+ "style_tags_applied": style_tags,
225
+ "llm_enhanced": False,
226
+ "message": f"Generated {panels} panels using fallback methods",
227
+ }
228
+
229
+ return json.dumps(result, indent=2)
230
+
231
+ def reset(self):
232
+ """Reset workflow state for a new session."""
233
+ if self.memory:
234
+ self.memory.clear()
235
+ if self.agent and hasattr(self.agent, "memory"):
236
+ self.agent.memory = ChatMemoryBuffer.from_defaults(
237
+ token_limit=4000
238
+ )
239
+ self.current_session = None
240
+ self.session_manager = None
241
+
242
+
243
+ def create_agent_bayko(openai_api_key: Optional[str] = None) -> BaykoWorkflow:
244
+ """
245
+ Factory function to create and initialize Agent Bayko workflow
246
+
247
+ This function creates a BaykoWorkflow instance with proper initialization of:
248
+ - LlamaIndex LLM for enhanced prompts
249
+ - ReActAgent with function tools
250
+ - Memory buffer (session services are attached later via initialize_session)
251
+ - Core Bayko agent with tools
252
+
253
+ Args:
254
+ openai_api_key: OpenAI API key for LLM functionality. If None, will try to use environment variable.
255
+
256
+ Returns:
257
+ Configured BaykoWorkflow instance with all components initialized
258
+ """
259
+ # Create workflow instance
260
+ workflow = BaykoWorkflow(openai_api_key=openai_api_key)
261
+
262
+ # Initialize all workflow components
263
+ workflow.initialize_workflow()
264
+
265
+ return workflow
266
+
267
+
268
+ # Example usage for testing
269
+ def main():
270
+ """Example usage of Bayko Workflow"""
271
+
272
+ # Check for API key
273
+ if not os.getenv("OPENAI_API_KEY"):
274
+ print("❌ Please set OPENAI_API_KEY environment variable")
275
+ return
276
+
277
+ # Create workflow with API key
278
+ workflow = create_agent_bayko(os.getenv("OPENAI_API_KEY"))
279
+
280
+ # Test request
281
+ test_request = {
282
+ "prompt": "A melancholic K-pop idol discovers a lost puppy in the neon-lit streets of Seoul at night. The encounter changes everything.",
283
+ "original_prompt": "A K-pop idol finds a puppy that changes their life",
284
+ "style_tags": [
285
+ "anime",
286
+ "soft_lighting",
287
+ "emotional",
288
+ "watercolor",
289
+ "night_scene",
290
+ "neon_lights",
291
+ ],
292
+ "panels": 4,
293
+ "language": "korean",
294
+ "extras": ["narration", "subtitles"],
295
+ "session_id": SessionIdGenerator.create_session_id("test"),
296
+ }
297
+ # Initialize session, reusing the request's session_id so the two match
+ session_id = test_request["session_id"]
298
+ conversation_id = f"conv_{session_id}"
299
+ workflow.initialize_session(session_id, conversation_id)
300
+
301
+ # Process request with visible reasoning
302
+ print("\n🤖 Testing Bayko Workflow with LLM Reasoning")
303
+ print("=" * 80)
304
+ print(f"📝 Original prompt: {test_request['original_prompt']}")
305
+ print(f"✨ Enhanced prompt: {test_request['prompt']}")
306
+ print(f"🎨 Style tags: {test_request['style_tags']}")
307
+ print(f"🖼️ Panels: {test_request['panels']}")
308
+ print("\n🔄 Processing generation request...")
309
+
310
+ try:
311
+ result = workflow.process_generation_request(test_request)
312
+ print("\n🎉 Generation completed!")
313
+ print("\n📋 Generation Result:")
314
+ print("=" * 40)
315
+ print(result)
316
+ print("\n✅ Test completed successfully!")
317
+ except Exception as e:
318
+ print(f"\n❌ Error during generation: {e}")
319
+ print("⚠️ Attempting fallback generation...")
320
+ result = workflow._fallback_generation(test_request)
321
+ print("\n📋 Fallback Result:")
322
+ print("=" * 40)
323
+ print(result)
324
+ print("\n⚠️ Test completed with fallback")
325
+
326
+
327
+ if __name__ == "__main__":
328
+ main()
agents/bayko_workflow_tools.py ADDED
@@ -0,0 +1,237 @@
1
+ """
2
+ Tool wrapper class for Agent Bayko's LLM-enhanced workflow methods
3
+ """
4
+
5
+ import json
6
+ import asyncio
7
+ from typing import Dict, Any, List
8
+ from llama_index.core.tools import FunctionTool
9
+
10
+ from agents.bayko import AgentBayko
11
+ from agents.bayko_tools import ModalImageGenerator, ModalCodeExecutor
12
+
13
+
14
+ class BaykoWorkflowTools:
15
+ """Tool wrapper class for Agent Bayko's LLM-enhanced methods"""
16
+
17
+ def __init__(self, bayko_agent: AgentBayko):
18
+ self.bayko = bayko_agent
19
+
20
+ def generate_enhanced_prompt_tool(
21
+ self, description: str, style_tags: str = "[]", mood: str = "neutral"
22
+ ) -> str:
23
+ """Generate LLM-enhanced prompt for SDXL image generation from panel description."""
24
+ try:
25
+ style_tags_list = json.loads(style_tags) if style_tags else []
26
+ except json.JSONDecodeError:
27
+ style_tags_list = []
28
+
29
+ result = self.bayko.generate_prompt_from_description(
30
+ description, style_tags_list, mood
31
+ )
32
+
33
+ return json.dumps(
34
+ {
35
+ "enhanced_prompt": result,
36
+ "original_description": description,
37
+ "style_tags": style_tags_list,
38
+ "mood": mood,
39
+ "llm_used": self.bayko.llm is not None,
40
+ }
41
+ )
42
+
43
+ def revise_panel_description_tool(
44
+ self, description: str, feedback: str = "{}", focus_areas: str = "[]"
45
+ ) -> str:
46
+ """Revise panel description based on Agent Brown's feedback using LLM."""
47
+ try:
48
+ feedback_dict = json.loads(feedback) if feedback else {}
49
+ focus_areas_list = json.loads(focus_areas) if focus_areas else []
50
+ except json.JSONDecodeError:
51
+ feedback_dict = {}
52
+ focus_areas_list = []
53
+
54
+ result = self.bayko.revise_panel_description(
55
+ description, feedback_dict, focus_areas_list
56
+ )
57
+
58
+ return json.dumps(
59
+ {
60
+ "revised_description": result,
61
+ "original_description": description,
62
+ "feedback_applied": feedback_dict,
63
+ "focus_areas": focus_areas_list,
64
+ "llm_used": self.bayko.llm is not None,
65
+ }
66
+ )
67
+
68
+ async def generate_panel_content_tool(self, panel_data: str) -> str:
69
+ """Generate complete panel content including image, audio, subtitles, and code execution concurrently."""
70
+ try:
71
+ data = json.loads(panel_data)
72
+ except json.JSONDecodeError:
73
+ return json.dumps({"error": "Invalid panel data JSON"})
74
+
75
+ # Extract panel information
76
+ panel_id = data.get("panel_id", 1)
77
+ description = data.get("description", "")
78
+ enhanced_prompt = data.get("enhanced_prompt", "")
79
+ style_tags = data.get("style_tags", [])
80
+ # language = data.get("language", "english")
81
+ # extras = data.get("extras", [])
82
+ session_id = data.get("session_id", "default")
83
+ # dialogues = data.get("dialogues", [])
84
+ code_snippets = data.get("code_snippets", [])  # used below for optional code execution
85
+
86
+ # Initialize Modal tools
87
+
88
+ image_gen = ModalImageGenerator()
89
+ # tts_gen = TTSGenerator()
90
+ # subtitle_gen = SubtitleGenerator()
91
+ code_executor = ModalCodeExecutor()
92
+
93
+ # Create concurrent tasks for parallel execution
94
+ tasks = []
95
+
96
+ # 1. Always generate image
97
+ tasks.append(
98
+ image_gen.generate_panel_image(
99
+ enhanced_prompt, style_tags, panel_id, session_id
100
+ )
101
+ )
102
+
103
+ # 2. Execute code if provided (the audio/subtitle tasks above are disabled)
104
+ if code_snippets and panel_id <= len(code_snippets):
105
+ code_data = (
106
+ code_snippets[panel_id - 1]
107
+ if isinstance(code_snippets, list)
108
+ else code_snippets
109
+ )
110
+ if isinstance(code_data, dict):
111
+ code = code_data.get("code", "")
112
+ code_language = code_data.get("language", "python")
113
+ context = code_data.get("context", description)
114
+ else:
115
+ code = str(code_data)
116
+ code_language = "python"
117
+ context = description
118
+
119
+ if code.strip():
120
+ tasks.append(code_executor.execute_code(code, session_id))  # note: code_language/context are currently unused
121
+ else:
122
+ tasks.append(
123
+ asyncio.create_task(asyncio.sleep(0))
124
+ ) # No-op task
125
+ else:
126
+ tasks.append(asyncio.create_task(asyncio.sleep(0))) # No-op task
127
+
128
+ # Execute all tasks concurrently
129
+ start_time = asyncio.get_event_loop().time()
130
+ results = await asyncio.gather(*tasks, return_exceptions=True)
131
+ total_time = asyncio.get_event_loop().time() - start_time
132
+
133
+ # Process results safely
134
+ def safe_get_path(result):
135
+ if isinstance(result, Exception) or result is None:
136
+ return None
137
+ if isinstance(result, tuple) and len(result) >= 1:
138
+ return result[0]
139
+ return None
140
+
141
+ def safe_check_exists(result):
142
+ path = safe_get_path(result)
143
+ return path is not None
144
+
145
+ image_path = safe_get_path(results[0])
146
+ audio_path = None  # TTS task is disabled above; nothing was gathered for it
147
+ subtitle_path = None  # subtitle task is likewise disabled
148
+ code_path = safe_get_path(results[1])  # the code/no-op task is second in `tasks`
149
+
150
+ # Build result
151
+ result = {
152
+ "panel_id": panel_id,
153
+ "description": description,
154
+ "enhanced_prompt": enhanced_prompt,
155
+ "image_path": image_path,
156
+ "image_url": f"file://{image_path}" if image_path else None,
157
+ "audio_path": audio_path,
158
+ "subtitles_path": subtitle_path,
159
+ "code_result_path": code_path,
160
+ "style_applied": style_tags,
161
+ "generation_time": total_time,
162
+ "status": "completed",
163
+ "concurrent_execution": True,
164
+ "tasks_completed": {
165
+ "image": image_path is not None,
166
+ "audio": audio_path is not None,
167
+ "subtitles": subtitle_path is not None,
168
+ "code": code_path is not None,
169
+ },
170
+ }
171
+
172
+ return json.dumps(result)
173
+
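+ # Illustrative `panel_data` payload for generate_panel_content_tool. The keys
+ # mirror the .get() calls above; the values are made-up examples, not fixtures
+ # shipped with this commit:
+ #
+ # {
+ #     "panel_id": 1,
+ #     "description": "Idol finds a puppy in a neon alley",
+ #     "enhanced_prompt": "anime style, soft neon lighting, night scene",
+ #     "style_tags": ["anime", "watercolor"],
+ #     "session_id": "test_123",
+ #     "code_snippets": []
+ # }
+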
174
+ def get_session_info_tool(self) -> str:
175
+ """Get current Bayko session information and memory state."""
176
+ info = self.bayko.get_session_info()
177
+ return json.dumps(
178
+ {
179
+ "session_id": info.get("session_id"),
180
+ "memory_size": info.get("memory_size", 0),
181
+ "generation_stats": info.get("generation_stats", {}),
182
+ "llm_available": self.bayko.llm is not None,
183
+ "status": "active" if info.get("session_id") else "inactive",
184
+ }
185
+ )
186
+
187
+ def save_llm_data_tool(self, data_type: str, data: str) -> str:
188
+ """Save LLM generation or revision data to session storage."""
189
+ try:
190
+ data_dict = json.loads(data)
191
+ except json.JSONDecodeError:
192
+ return json.dumps({"error": "Invalid data JSON"})
193
+
194
+ if data_type == "generation":
195
+ self.bayko._save_llm_generation_data(data_dict)
196
+ elif data_type == "revision":
197
+ self.bayko._save_llm_revision_data(data_dict)
198
+ else:
199
+ return json.dumps({"error": "Invalid data type"})
200
+
201
+ return json.dumps(
202
+ {
203
+ "status": "saved",
204
+ "data_type": data_type,
205
+ "session_id": self.bayko.current_session,
206
+ }
207
+ )
208
+
209
+ def create_llamaindex_tools(self) -> List[FunctionTool]:
210
+ """Create LlamaIndex FunctionTools from Bayko's LLM-enhanced methods"""
211
+ return [
212
+ FunctionTool.from_defaults(
213
+ fn=self.generate_enhanced_prompt_tool,
214
+ name="generate_enhanced_prompt",
215
+ description="Generate LLM-enhanced prompt for SDXL image generation. Takes panel description, style tags, and mood. Returns enhanced prompt optimized for text-to-image models.",
216
+ ),
217
+ FunctionTool.from_defaults(
218
+ fn=self.revise_panel_description_tool,
219
+ name="revise_panel_description",
220
+ description="Revise panel description based on Agent Brown's feedback using LLM. Takes original description, feedback, and focus areas. Returns improved description.",
221
+ ),
222
+ FunctionTool.from_defaults(
223
+ async_fn=self.generate_panel_content_tool,
224
+ name="generate_panel_content",
225
+ description="Generate complete panel content including image, audio, subtitles, and code execution concurrently. Takes panel data JSON with description, style, and generation parameters.",
226
+ ),
227
+ FunctionTool.from_defaults(
228
+ fn=self.get_session_info_tool,
229
+ name="get_session_info",
230
+ description="Get current Bayko session information including memory state and generation statistics.",
231
+ ),
232
+ FunctionTool.from_defaults(
233
+ fn=self.save_llm_data_tool,
234
+ name="save_llm_data",
235
+ description="Save LLM generation or revision data to session storage. Takes data type ('generation' or 'revision') and data JSON.",
236
+ ),
237
+ ]
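+
+
+ # Illustrative wiring (a sketch, not part of this commit): these FunctionTools
+ # are meant to be handed to a ReActAgent, as agents/bayko_workflow.py does.
+ # Assumes OPENAI_API_KEY is set in the environment.
+ #
+ # from llama_index.core.agent import ReActAgent
+ # from llama_index.llms.openai import OpenAI
+ #
+ # tools = BaykoWorkflowTools(AgentBayko()).create_llamaindex_tools()
+ # agent = ReActAgent.from_tools(tools=tools, llm=OpenAI(model="gpt-4o"))
+ # print(agent.chat("Generate an enhanced prompt for a rainy rooftop scene"))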
agents/brown.py ADDED
@@ -0,0 +1,641 @@
1
+ """
2
+ Agent Brown - The Orchestrator Agent
3
+
4
+ Agent Brown is the front-facing orchestrator that handles:
5
+ - Prompt validation and moderation
6
+ - Style tagging and enhancement
7
+ - JSON packaging for Agent Bayko
8
+ - Feedback review and refinement requests
9
+ - Session state management via LlamaIndex
10
+
11
+ This is the entry point for all user requests and manages the multi-turn
12
+ feedback loop with Agent Bayko for iterative comic generation.
13
+
14
+ Core AgentBrown class with validation, processing, and review capabilities
15
+ """
16
+
17
+ import uuid
18
+ import logging
19
+ from typing import Dict, List, Optional, Any
20
+ from dataclasses import dataclass, asdict
21
+ from enum import Enum
22
+ import os
23
+ import json
24
+ from pathlib import Path
25
+ from datetime import datetime
26
+
27
+ # LlamaIndex imports for multimodal ReActAgent
28
+ try:
29
+ from llama_index.multi_modal_llms.openai import OpenAIMultiModal
30
+ from llama_index.core.agent import ReActAgent
31
+ from llama_index.core.memory import ChatMemoryBuffer
32
+ from llama_index.core.tools import FunctionTool, BaseTool
33
+ from llama_index.core.llms import (
34
+ ChatMessage,
35
+ ImageBlock,
36
+ TextBlock,
37
+ MessageRole,
38
+ )
39
+
40
+ from llama_index.core.schema import ImageNode, Document
41
+ from llama_index.core import SimpleDirectoryReader
42
+ from typing import cast
43
+ except ImportError:
44
+ OpenAIMultiModal = None
45
+ ReActAgent = None
46
+ ChatMemoryBuffer = None
47
+ FunctionTool = None
48
+ BaseTool = None
49
+ ChatMessage = None
50
+ ImageBlock = None
51
+ TextBlock = None
52
+ MessageRole = None
53
+ ImageNode = None
54
+ Document = None
+ SimpleDirectoryReader = None
55
+
56
+ # Core services
57
+ from services.unified_memory import AgentMemory
58
+ from services.simple_evaluator import SimpleEvaluator
59
+ from services.content_moderator import ContentModerator
60
+ from services.style_tagger import StyleTagger
61
+ from services.message_factory import MessageFactory, AgentMessage, MessageType
62
+ from services.session_manager import SessionManager
63
+
64
+ # Configure logging
65
+ logging.basicConfig(level=logging.INFO)
66
+ logger = logging.getLogger(__name__)
67
+
68
+
69
+ class ValidationStatus(Enum):
70
+ """Validation result statuses"""
71
+
72
+ VALID = "valid"
73
+ INVALID = "invalid"
74
+ WARNING = "warning"
75
+
76
+
77
+ @dataclass
78
+ class ValidationResult:
79
+ """Result of input validation"""
80
+
81
+ status: ValidationStatus
82
+ issues: List[str]
83
+ suggestions: List[str]
84
+ confidence_score: float
85
+
86
+ def is_valid(self) -> bool:
87
+ return self.status == ValidationStatus.VALID
88
+
89
+
90
+ @dataclass
91
+ class StoryboardRequest:
92
+ """Incoming request from user interface"""
93
+
94
+ prompt: str
95
+ style_preference: Optional[str] = None
96
+ panels: int = 4
97
+ language: str = "english"
98
+ extras: Optional[List[str]] = None
99
+
100
+ def __post_init__(self):
101
+ if self.extras is None:
102
+ self.extras = []
103
+
104
+
105
+ class AgentBrown:
106
+ """
107
+ Agent Brown - The Orchestrator
108
+
109
+ Main responsibilities:
110
+ - Validate and moderate user input
111
+ - Analyze and tag visual styles
112
+ - Package requests for Agent Bayko
113
+ - Review generated content and provide feedback
114
+ - Manage multi-turn refinement loops
115
+ - Maintain session state and memory
116
+ """
117
+
118
+ def __init__(
119
+ self, max_iterations: int = 3, openai_api_key: Optional[str] = None
120
+ ):
121
+ self.max_iterations = max_iterations
122
+ self.session_id = None
123
+ self.conversation_id = None
124
+ self.iteration_count = 0
125
+
126
+ # Initialize LLM for prompt enhancement
127
+ self.llm = None
128
+ try:
129
+ if OpenAIMultiModal:
130
+ self.llm = OpenAIMultiModal(
131
+ model="gpt-4o",
132
+ api_key=openai_api_key or os.getenv("OPENAI_API_KEY"),
133
+ temperature=0.7,
134
+ max_tokens=2048,
135
+ )
136
+ logger.info("✓ Initialized GPT-4o multimodal LLM")
137
+ except Exception as e:
138
+ logger.warning(f"⚠️ Could not initialize LLM: {e}")
139
+
140
+ # Core services
141
+ self.moderator = ContentModerator()
142
+ self.style_tagger = StyleTagger()
143
+ self.evaluator = SimpleEvaluator()
144
+
145
+ # Session services (initialized later)
146
+ self.memory = None
147
+ self.message_factory = None
148
+ self.session_manager = None
149
+
150
+ logger.info("Agent Brown initialized with core services")
151
+
152
+ def validate_input(self, request: StoryboardRequest) -> ValidationResult:
153
+ """
154
+ Validate user input for appropriateness and completeness
155
+
156
+ Args:
157
+ request: User's storyboard request
158
+
159
+ Returns:
160
+ ValidationResult with status and feedback
161
+ """
162
+ issues = []
163
+ suggestions = []
164
+
165
+ # Basic validation
166
+ if not request.prompt or len(request.prompt.strip()) < 10:
167
+ issues.append(
168
+ "Prompt too short - needs more detail for story generation"
169
+ )
170
+ suggestions.append(
171
+ "Add more context about characters, setting, emotions, or plot"
172
+ )
173
+
174
+ if len(request.prompt) > 1000:
175
+ issues.append("Prompt too long - may lose focus during generation")
176
+ suggestions.append(
177
+ "Condense to key story elements and main narrative arc"
178
+ )
179
+
180
+ # Content moderation
181
+ is_safe, moderation_issues = self.moderator.check_content(
182
+ request.prompt
183
+ )
184
+ if not is_safe:
185
+ issues.extend(moderation_issues)
186
+ suggestions.append(
187
+ "Please revise content to ensure it's family-friendly"
188
+ )
189
+
190
+ # Panel count validation
191
+ if request.panels < 1 or request.panels > 12:
192
+ issues.append(
193
+ f"Panel count ({request.panels}) outside recommended range (1-12)"
194
+ )
195
+ suggestions.append("Use 3-6 panels for optimal storytelling flow")
196
+
197
+ # Language validation
198
+ supported_languages = [
199
+ "english",
200
+ "korean",
201
+ "japanese",
202
+ "spanish",
203
+ "french",
204
+ ]
205
+ if request.language.lower() not in supported_languages:
206
+ issues.append(
207
+ f"Language '{request.language}' may not be fully supported"
208
+ )
209
+ suggestions.append(
210
+ f"Consider using: {', '.join(supported_languages)}"
211
+ )
212
+
213
+ # Calculate confidence score
214
+ confidence = max(
215
+ 0.0, 1.0 - (len(issues) * 0.3) - (len(suggestions) * 0.1)
216
+ )
217
+
218
+ # Determine status
219
+ if issues:
220
+ status = ValidationStatus.INVALID
221
+ elif suggestions:
222
+ status = ValidationStatus.WARNING
223
+ else:
224
+ status = ValidationStatus.VALID
225
+
226
+ result = ValidationResult(
227
+ status=status,
228
+ issues=issues,
229
+ suggestions=suggestions,
230
+ confidence_score=confidence,
231
+ )
232
+
233
+ # Log validation to memory
234
+ if self.memory:
235
+ self.memory.add_message(
236
+ "assistant",
237
+ f"Validated input: {result.status.value} (confidence: {confidence:.2f})",
238
+ )
239
+
240
+ return result
241
+
242
+ def _ensure_session(self) -> bool:
243
+ """Ensure session services are initialized"""
244
+ if not all([self.memory, self.message_factory, self.session_manager]):
245
+ logger.warning("Session services not initialized")
246
+ self._initialize_session()
247
+ return True
248
+
254
+ def process_request(self, request: StoryboardRequest) -> AgentMessage:
255
+ """Process incoming user request and create message for Agent Bayko"""
256
+ self._ensure_session()
257
+ logger.info(f"Processing request for session {self.session_id}")
258
+
259
+ # Log user request and state to memory
260
+ self._safe_memory_add(
261
+ "system",
262
+ f"Starting new request with session_id: {self.session_id}",
263
+ )
264
+ self._safe_memory_add("user", f"Original prompt: {request.prompt}")
265
+ self._safe_memory_add(
266
+ "system",
267
+ f"Request parameters: {json.dumps(asdict(request), indent=2)}",
268
+ )
269
+
270
+ # Step 1: Validate input
271
+ validation = self.validate_input(request)
272
+ self._safe_memory_add(
273
+ "system",
274
+ f"Validation result: {json.dumps(asdict(validation), indent=2)}",
275
+ )
276
+
277
+ if not validation.is_valid():
278
+ self._safe_memory_add(
279
+ "system", f"Validation failed: {validation.issues}"
280
+ )
281
+ return self.message_factory.create_error_message(
282
+ validation.issues, validation.suggestions
283
+ )
284
+
285
+ # Step 2: Use LLM to enhance prompt and analyze style
286
+ try:
287
+ if self.llm:
288
+ enhancement_prompt = f"""Enhance this comic story prompt for visual storytelling:
289
+ Original: {request.prompt}
290
+ Style preference: {request.style_preference or 'any'}
291
+ Panels: {request.panels}
292
+
293
+ Provide:
294
+ 1. Enhanced story description
295
+ 2. Visual style suggestions
296
+ 3. Mood and atmosphere
297
+ 4. Color palette recommendations"""
298
+
299
+ self._safe_memory_add(
300
+ "system", f"Sending prompt to LLM:\n{enhancement_prompt}"
301
+ )
302
+
303
+ enhancement = self.llm.complete(
304
+ enhancement_prompt, image_documents=[]
305
+ ).text
306
+ self._safe_memory_add(
307
+ "assistant", f"LLM enhanced prompt:\n{enhancement}"
308
+ )
309
+ else:
310
+ enhancement = request.prompt
311
+ self._safe_memory_add(
312
+ "system", "No LLM available, using original prompt"
313
+ )
314
+
315
+ except Exception as e:
316
+ logger.error(f"LLM enhancement failed: {e}")
317
+ enhancement = request.prompt
318
+ self._safe_memory_add("system", f"LLM enhancement failed: {e}")
319
+
320
+ # Step 3: Analyze and tag style
321
+ style_analysis = self.style_tagger.analyze_style(
322
+ enhancement, request.style_preference
323
+ )
324
+ self._safe_memory_add(
325
+ "system",
326
+ f"Style analysis: {json.dumps(asdict(style_analysis), indent=2)}",
327
+ )
328
+
329
+ # Step 4: Create message for Bayko
330
+ if not self.message_factory:
331
+ self._initialize_session()
332
+ # Dialogue content is generated downstream by Bayko, so pass an empty list
333
+ message = self.message_factory.create_generation_request(
334
+ enhanced_prompt=enhancement,
335
+ original_prompt=request.prompt,
336
+ style_tags=style_analysis.style_tags,
337
+ panels=request.panels,
338
+ language=request.language,
339
+ extras=request.extras or [],
340
+ style_config={
341
+ "primary_style": style_analysis.detected_style,
342
+ "mood": style_analysis.mood,
343
+ "color_palette": style_analysis.color_palette,
344
+ "confidence": style_analysis.confidence,
345
+ },
346
+ validation_score=validation.confidence_score,
347
+ iteration=self.iteration_count,
348
+ dialogues=[],  # required by the factory signature; generated downstream
349
+ )
350
+
351
+ # Log to memory and save state
352
+ if self.memory:
353
+ self.memory.add_message(
354
+ "assistant",
355
+ f"Created generation request for Bayko with {len(style_analysis.style_tags)} style tags",
356
+ )
357
+ if not self.session_manager:
358
+ self._initialize_session()
359
+ if self.session_manager and self.memory:
360
+ self.session_manager.save_session_state(
361
+ message,
362
+ asdict(request),
363
+ self.memory.get_history(),
364
+ self.iteration_count,
365
+ )
366
+
367
+ logger.info(f"Generated request message {message.message_id}")
368
+ return message
369
+
370
+ def _safe_image_to_node(self, doc: Document) -> Optional[ImageNode]:
371
+ """Safely convert document to ImageNode"""
372
+ try:
373
+ if hasattr(doc, "image") and doc.image:
374
+ return ImageNode(text=doc.text or "", image=doc.image)
375
+ except Exception as e:
376
+ self._safe_memory_add(
377
+ "system", f"Failed to convert image to node: {e}"
378
+ )
379
+ return None
380
+
381
+ def _safe_memory_add(self, role: str, content: str) -> None:
382
+ """Safely add message to memory if available"""
383
+ self._ensure_session()
384
+ if self.memory:
385
+ self.memory.add_message(role, content)
386
+
387
+ async def review_output(
388
+ self,
389
+ bayko_response: Dict[str, Any],
390
+ original_request: StoryboardRequest,
391
+ ) -> Optional[AgentMessage]:
392
+ """Review Agent Bayko's output using GPT-4o for image analysis"""
393
+ self._ensure_session()
394
+
395
+ # Log review start
396
+ self._safe_memory_add(
397
+ "system",
398
+ f"""Starting review with GPT-4o: {json.dumps({
399
+ 'prompt': original_request.prompt,
400
+ 'panels': len(bayko_response.get('panels', [])),
401
+ 'iteration': self.iteration_count + 1
402
+ }, indent=2)}""",
403
+ )
404
+
405
+ try:
406
+ if not self.llm:
407
+ raise ValueError("GPT-4o LLM not initialized")
408
+
409
+ if "panels" not in bayko_response:
410
+ raise ValueError("No panels found in Bayko's response")
411
+
412
+ # Get session content directory
413
+ content_dir = Path(f"storyboard/{self.session_id}/content")
414
+ if not content_dir.exists():
415
+ raise ValueError(f"Content directory not found: {content_dir}")
416
+
417
+ # Prepare image files for analysis
418
+ image_files = []
419
+ for panel in bayko_response["panels"]:
420
+ panel_path = content_dir / f"panel_{panel['id']}.png"
421
+ if panel_path.exists():
422
+ image_files.append(str(panel_path))
423
+ else:
424
+ self._safe_memory_add(
425
+ "system",
426
+ f"Warning: Panel image not found: {panel_path}",
427
+ )
428
+
429
+ if not image_files:
430
+ raise ValueError("No panel images found for review")
431
+
432
+ # Load images using SimpleDirectoryReader
433
+ reader = SimpleDirectoryReader(input_files=image_files)
434
+ raw_docs = reader.load_data()
435
+
436
+ # Convert documents to ImageNodes
437
+ image_nodes = []
438
+ for doc in raw_docs:
439
+ if node := self._safe_image_to_node(doc):
440
+ image_nodes.append(node)
441
+
442
+ if not image_nodes:
443
+ raise ValueError("Failed to load any valid images for review")
444
+
445
+ self._safe_memory_add(
446
+ "system",
447
+ f"Successfully loaded {len(image_nodes)} images for GPT-4o review",
448
+ )
449
+
450
+ # Construct detailed review prompt
451
+ review_prompt = f"""As an expert art director, analyze these comic panels against the user's original request:
452
+
453
+ ORIGINAL REQUEST: {original_request.prompt}
454
+ STYLE PREFERENCE: {original_request.style_preference or 'Not specified'}
455
+ REQUESTED PANELS: {original_request.panels}
456
+
457
+ Analyze the following aspects:
458
+ 1. Story Accuracy:
459
+ - Do the panels accurately depict the requested story?
460
+ - Are the main story beats present?
461
+
462
+ 2. Visual Storytelling:
463
+ - Is the panel flow clear and logical?
464
+ - Does the sequence effectively convey the narrative?
465
+
466
+ 3. Style & Aesthetics:
467
+ - Does it match any requested style preferences?
468
+ - Is the artistic quality consistent?
469
+
470
+ 4. Technical Quality:
471
+ - Are the images clear and well-composed?
472
+ - Is there appropriate detail and contrast?
473
+
474
+ Make ONE of these decisions:
475
+ - APPROVE: If panels successfully tell the story and meet quality standards
476
+ - REFINE: If specific improvements would enhance the result (list them)
477
+ - REJECT: If fundamental issues require complete regeneration
478
+
479
+ Provide a clear, actionable analysis focusing on how well these panels fulfill the USER'S ORIGINAL REQUEST."""
480
+
481
+ # Get GPT-4o analysis
482
+ analysis = self.llm.complete(
483
+ prompt=review_prompt, image_documents=image_nodes
484
+ ).text
485
+
486
+ self._safe_memory_add("assistant", f"GPT-4o Analysis:\n{analysis}")
487
+
488
+ # Parse decision from analysis
489
+ decision = "refine" # Default to refine
490
+ if "APPROVE" in analysis.upper():
491
+ decision = "approve"
492
+ elif "REJECT" in analysis.upper():
493
+ decision = "reject"
494
+
495
+ # Create evaluation result
496
+ evaluation = {
497
+ "decision": decision,
498
+ "reason": analysis,
499
+ "confidence": 0.85, # High confidence with GPT-4o
500
+ "original_prompt": original_request.prompt,
501
+ "analyzed_panels": len(image_nodes),
502
+ "style_match": original_request.style_preference or "any",
503
+ }
504
+
505
+ self._safe_memory_add(
506
+ "system",
507
+ f"""GPT-4o review complete:\n{json.dumps({
508
+ 'decision': decision,
509
+ 'confidence': 0.85,
510
+ 'analyzed_panels': len(image_nodes)
511
+ }, indent=2)}""",
512
+ )
513
+
514
+ except Exception as e:
515
+ logger.error(f"GPT-4o review failed: {str(e)}")
516
+ self._safe_memory_add(
517
+ "system",
518
+ f"GPT-4o review failed, falling back to basic evaluator: {str(e)}",
519
+ )
520
+ # Fallback to basic evaluator
521
+ evaluation = self.evaluator.evaluate(
522
+ bayko_response, original_request.prompt
523
+ )
524
+
525
+ # Ensure message factory is available
526
+ if not self.message_factory:
527
+ self._initialize_session()
528
+
529
+ # Create appropriate response message
530
+ if evaluation["decision"] == "approve":
531
+ return self.message_factory.create_approval_message(
532
+ bayko_response, evaluation, self.iteration_count
533
+ )
534
+ elif evaluation["decision"] == "reject":
535
+ return self.message_factory.create_rejection_message(
536
+ bayko_response, evaluation, self.iteration_count
537
+ )
538
+ else:
539
+ return self.message_factory.create_refinement_message(
540
+ bayko_response, evaluation, self.iteration_count
541
+ )
542
+
543
+ def get_session_info(self) -> Dict[str, Any]:
544
+ """Get current session information"""
545
+ memory_size = 0
546
+ if self.memory:
547
+ try:
548
+ memory_size = len(self.memory.get_history())
549
+ except Exception:
550
+ memory_size = 0
551
+
552
+ return {
553
+ "session_id": self.session_id,
554
+ "conversation_id": self.conversation_id,
555
+ "iteration_count": self.iteration_count,
556
+ "memory_size": memory_size,
557
+ "max_iterations": self.max_iterations,
558
+ }
559
+
560
+ def _initialize_session(
561
+ self,
562
+ session_id: Optional[str] = None,
563
+ conversation_id: Optional[str] = None,
564
+ ):
565
+ """Initialize a new session with optional existing IDs or generate new ones"""
566
+ if not self.session_manager:
567
+ self.session_manager = SessionManager()
568
+
569
+ if not session_id:
570
+ session_id = str(uuid.uuid4())
571
+
572
+ if not conversation_id:
573
+ conversation_id = str(uuid.uuid4())
574
+
575
+ self.session_id = session_id
576
+ self.conversation_id = conversation_id
577
+ # Initialize session-specific services
578
+ self.memory = AgentMemory(self.session_id, "brown")
579
+ self.message_factory = MessageFactory(
580
+ self.session_id, self.conversation_id
581
+ )
582
+ self.session_manager = SessionManager(
583
+ self.session_id, self.conversation_id
584
+ )
585
+
586
+ # Log initialization
587
+ logger.info(
588
+ f"🧠 Brown initialized memory for session {self.session_id}"
589
+ )
590
+ if self.memory:
591
+ self.memory.add_message(
592
+ "system", f"Session initialized with ID: {self.session_id}"
593
+ )
594
+
595
+
596
+ # Example usage and testing
597
+ def main():
598
+ """Example usage of Agent Brown"""
599
+ # Create Brown instance
600
+ brown = AgentBrown(max_iterations=3)
601
+
602
+ # Example request
603
+ request = StoryboardRequest(
604
+ prompt="A moody K-pop idol finds a puppy on the street. "
605
+ "It changes everything.",
606
+ style_preference="studio_ghibli",
607
+ panels=4,
608
+ language="korean",
609
+ extras=["narration", "subtitles"],
610
+ )
611
+
612
+ # Process request
613
+ message = brown.process_request(request)
614
+ print("Generated message for Bayko:")
615
+ print(message.to_json())
616
+
617
+ # Example Bayko response (simulated)
618
+ bayko_response = {
619
+ "panels": [
620
+ {"id": 1, "description": "Idol walking alone"},
621
+ {"id": 2, "description": "Discovers puppy"},
622
+ {"id": 3, "description": "Moment of connection"},
623
+ {"id": 4, "description": "Walking together"},
624
+ ],
625
+ "style_tags": ["whimsical", "soft_lighting"],
626
+ "metadata": {"generation_time": "45s"},
627
+ }
628
+
629
+ # Review output (review_output is a coroutine, so drive it to completion)
630
+ import asyncio
+ review_result = asyncio.run(brown.review_output(bayko_response, request))
631
+ if review_result:
632
+ print("\nReview result:")
633
+ # Return the review result
634
+ return review_result
635
+
636
+ # Show session info
637
+ print(f"\nSession info: {brown.get_session_info()}")
638
+
639
+
640
+ if __name__ == "__main__":
641
+ main()
agents/brown_tools.py ADDED
@@ -0,0 +1,238 @@
1
+ """
2
+ Brown Agent Tools for LlamaIndex Integration
3
+ Wraps existing AgentBrown methods as LlamaIndex FunctionTools for hackathon demo
4
+ """
5
+
6
+ import json
7
+ from typing import Dict, Any, List, Optional
8
+ from llama_index.core.tools import FunctionTool
9
+ from agents.brown import AgentBrown, StoryboardRequest
10
+
11
+
12
+ class BrownTools:
13
+ """Tool wrapper class for Agent Brown's methods"""
14
+
15
+ def __init__(self, max_iterations: int = 3):
16
+ # Use your original AgentBrown class directly
17
+ self.brown = AgentBrown(max_iterations)
18
+ self._current_request = None
19
+
20
+ def validate_input_tool(
21
+ self,
22
+ prompt: str,
23
+ style_preference: Optional[str] = None,
24
+ panels: int = 4,
25
+ language: str = "english",
26
+ extras: str = "[]",
27
+ ) -> str:
28
+ """Validate user input for comic generation. Returns validation status and feedback."""
29
+ try:
30
+ extras_list = json.loads(extras) if extras else []
31
+ except json.JSONDecodeError:
32
+ extras_list = []
33
+
34
+ request = StoryboardRequest(
35
+ prompt=prompt,
36
+ style_preference=style_preference,
37
+ panels=panels,
38
+ language=language,
39
+ extras=extras_list,
40
+ )
41
+
42
+ # Store for later use
43
+ self._current_request = request
44
+
45
+ result = self.brown.validate_input(request)
46
+
47
+ return json.dumps(
48
+ {
49
+ "status": result.status.value,
50
+ "is_valid": result.is_valid(),
51
+ "issues": result.issues,
52
+ "suggestions": result.suggestions,
53
+ "confidence_score": result.confidence_score,
54
+ }
55
+ )
56
+
57
+ def process_request_tool(
58
+ self,
59
+ prompt: str,
60
+ style_preference: Optional[str] = None,
61
+ panels: int = 4,
62
+ language: str = "english",
63
+ extras: str = "[]",
64
+ ) -> str:
65
+ """Process validated request and create structured message for Agent Bayko."""
66
+ try:
67
+ extras_list = json.loads(extras) if extras else []
68
+ except json.JSONDecodeError:
69
+ extras_list = []
70
+
71
+ request = StoryboardRequest(
72
+ prompt=prompt,
73
+ style_preference=style_preference,
74
+ panels=panels,
75
+ language=language,
76
+ extras=extras_list,
77
+ )
78
+
79
+ # Store for later use
80
+ self._current_request = request
81
+
82
+ message = self.brown.process_request(request)
83
+
84
+ return json.dumps(
85
+ {
86
+ "message_id": message.message_id,
87
+ "session_id": self.brown.session_id,
88
+ "enhanced_prompt": message.payload.get("prompt", ""),
89
+ "style_tags": message.payload.get("style_tags", []),
90
+ "panels": message.payload.get("panels", 4),
91
+ "status": "ready_for_bayko",
92
+ }
93
+ )
94
+
95
+ def simulate_bayko_generation(self, message_data: str) -> str:
96
+ """Simulate Agent Bayko's content generation for demo purposes."""
97
+ try:
98
+ data = json.loads(message_data)
99
+ except json.JSONDecodeError:
100
+ data = {"panels": 4}
101
+
102
+ panels_count = data.get("panels", 4)
103
+ style_tags = data.get("style_tags", ["studio_ghibli", "soft_lighting"])
104
+ enhanced_prompt = data.get("enhanced_prompt", "")
105
+ original_prompt = data.get("original_prompt", "")
106
+
107
+ # Simulate Bayko's response with realistic image URLs for multimodal analysis
108
+ bayko_response = {
109
+ "session_id": self.brown.session_id,
110
+ "panels": [
111
+ {
112
+ "id": i + 1,
113
+ "description": f"Panel {i+1}: {original_prompt} - {', '.join(style_tags)} style",
114
+ "image_path": f"storyboard/{self.brown.session_id}/content/panel_{i+1}.png",
115
+ "image_url": f"https://example.com/generated/panel_{i+1}.png", # For multimodal analysis
116
+ "audio_path": (
117
+ f"panel_{i+1}.mp3"
118
+ if "narration" in str(data)
119
+ else None
120
+ ),
121
+ "subtitles_path": (
122
+ f"panel_{i+1}.vtt"
123
+ if "subtitles" in str(data)
124
+ else None
125
+ ),
126
+ }
127
+ for i in range(panels_count)
128
+ ],
129
+ "style_tags": style_tags,
130
+ "metadata": {
131
+ "generation_time": "45s",
132
+ "total_panels": panels_count,
133
+ "status": "completed",
134
+ "enhanced_prompt": enhanced_prompt,
135
+ "original_prompt": original_prompt,
136
+ },
137
+ }
138
+
139
+ return json.dumps(bayko_response)
140
+
141
+ def review_bayko_output_tool(
142
+ self, bayko_response_json: str, original_prompt: str
143
+ ) -> str:
144
+ """Review Agent Bayko's output and determine if refinement is needed."""
145
+ try:
146
+ bayko_response = json.loads(bayko_response_json)
147
+ except json.JSONDecodeError:
148
+ # Fallback response for demo
149
+ bayko_response = {
150
+ "panels": [{"id": 1, "description": "Generated content"}],
151
+ "style_tags": ["studio_ghibli"],
152
+ "metadata": {"generation_time": "45s"},
153
+ }
154
+
155
+ # Use stored request or create new one
156
+ request = self._current_request or StoryboardRequest(
157
+ prompt=original_prompt
158
+ )
159
+
160
+ # review_output is async; drive it to completion here (assumes a sync caller)
+ import asyncio
+ result = asyncio.run(self.brown.review_output(bayko_response, request))
161
+
162
+ if result:
163
+ # Extract decision from the result
164
+ payload = result.payload
165
+ if "approved_content" in payload:
166
+ decision = "APPROVED"
167
+ reason = "Content meets quality standards"
168
+ elif "feedback" in payload:
169
+ decision = (
170
+ payload["feedback"].get("decision", "REFINE").upper()
171
+ )
172
+ reason = payload["feedback"].get(
173
+ "reason", "Quality assessment needed"
174
+ )
175
+ else:
176
+ decision = "REFINE"
177
+ reason = "Content needs improvement"
178
+ else:
179
+ decision = "APPROVED"
180
+ reason = "Content meets quality standards"
181
+
182
+ return json.dumps(
183
+ {
184
+ "decision": decision,
185
+ "reason": reason,
186
+ "iteration": self.brown.iteration_count,
187
+ "max_iterations": self.brown.max_iterations,
188
+ "final": decision in ["APPROVED", "REJECTED"],
189
+ }
190
+ )
191
+
192
+ def get_session_info_tool(self) -> str:
193
+ """Get current session information and processing state."""
194
+ info = self.brown.get_session_info()
195
+ return json.dumps(
196
+ {
197
+ "session_id": info.get("session_id"),
198
+ "iteration_count": info.get("iteration_count", 0),
199
+ "max_iterations": info.get("max_iterations", 3),
200
+ "memory_size": info.get("memory_size", 0),
201
+ "status": "active" if info.get("session_id") else "inactive",
202
+ }
203
+ )
204
+
205
+ def create_llamaindex_tools(self) -> List[FunctionTool]:
206
+ """Create LlamaIndex FunctionTools from Brown's methods"""
207
+ return [
208
+ FunctionTool.from_defaults(
209
+ fn=self.validate_input_tool,
210
+ name="validate_input",
211
+ description="Validate user input for comic generation. MUST be called first for any user prompt. Returns validation status, issues, and suggestions.",
212
+ ),
213
+ FunctionTool.from_defaults(
214
+ fn=self.process_request_tool,
215
+ name="process_request",
216
+ description="Process validated request and create structured message for Agent Bayko. Call after validation passes. Returns enhanced prompt and generation parameters.",
217
+ ),
218
+ FunctionTool.from_defaults(
219
+ fn=self.simulate_bayko_generation,
220
+ name="simulate_bayko_generation",
221
+ description="Simulate Agent Bayko's content generation process. Takes processed request data and returns generated comic content.",
222
+ ),
223
+ FunctionTool.from_defaults(
224
+ fn=self.review_bayko_output_tool,
225
+ name="review_bayko_output",
226
+ description="Review Agent Bayko's generated content and decide if refinement is needed. Returns approval, refinement request, or rejection decision.",
227
+ ),
228
+ FunctionTool.from_defaults(
229
+ fn=self.get_session_info_tool,
230
+ name="get_session_info",
231
+ description="Get current session information and processing state. Use to track progress and iterations.",
232
+ ),
233
+ ]
234
+
235
+
236
+ def create_brown_tools(max_iterations: int = 3) -> BrownTools:
237
+ """Factory function to create BrownTools instance"""
238
+ return BrownTools(max_iterations=max_iterations)
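+
+
+ # Illustrative usage (a sketch; the prompt and values here are made up):
+ #
+ # tools = create_brown_tools(max_iterations=3)
+ # print(tools.validate_input_tool("A moody K-pop idol finds a puppy", panels=4))
+ # llamaindex_tools = tools.create_llamaindex_tools()  # hand these to a ReActAgent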
agents/brown_workflow.py ADDED
@@ -0,0 +1,702 @@
1
+ """
2
+ Agent Brown Workflow - Streamlined for hackathon demo
3
+ """
4
+
5
+ import os
6
+ import json
7
+ import asyncio
8
+ import time
9
+ from typing import Dict, List, Optional, Any, Sequence
10
+ from datetime import datetime
11
+ from llama_index.multi_modal_llms.openai import OpenAIMultiModal
12
+ from agents.brown import AgentBrown, StoryboardRequest
13
+ from agents.brown_tools import create_brown_tools
14
+ from agents.bayko_workflow import create_agent_bayko
15
+ from llama_index.core.agent import ReActAgent
16
+ from llama_index.core.memory import ChatMemoryBuffer
17
+ from llama_index.core.tools import FunctionTool, BaseTool
18
+ from llama_index.core.llms import (
19
+ ChatMessage,
20
+ ImageBlock,
21
+ TextBlock,
22
+ MessageRole,
23
+ )
24
+ from llama_index.core.workflow import (
25
+ Event,
26
+ StartEvent,
27
+ StopEvent,
28
+ Workflow,
29
+ step,
30
+ )
31
+ from pydantic import Field
32
+
33
+ from llama_index.core.llms.llm import ToolSelection
34
+ from llama_index.core.tools.types import ToolOutput
35
+ from llama_index.core.workflow import Context
36
+ from prompts.brown_workflow_system_prompt import BROWN_WORKFLOW_SYSTEM_PROMPT
37
+
38
+
39
+ # Global LLM throttle
40
+ _llm_lock = asyncio.Lock()
41
+ _last_llm_time = 0
42
+
43
+
44
+ async def throttle_llm():
45
+ global _last_llm_time
46
+ async with _llm_lock:
47
+ now = time.time()
48
+ wait = max(0, 21 - (now - _last_llm_time))
49
+ if wait > 0:
50
+ await asyncio.sleep(wait)
51
+ _last_llm_time = time.time()
52
+
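+ # Usage sketch: await throttle_llm() right before each LLM call so calls made
+ # from any coroutine are spaced at least ~21s apart (roughly 3 requests/min):
+ #
+ # async def enhance(prompt: str) -> str:
+ #     await throttle_llm()
+ #     return (await llm.acomplete(prompt)).text  # `llm` is any async LlamaIndex LLM
+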
53
+
54
+ # Workflow Events
55
+ class InputEvent(Event):
56
+ def __init__(self, input: list):
57
+ super().__init__()
58
+ self.input = input
59
+
60
+
61
+ class StreamEvent(Event):
62
+ def __init__(self, delta: str):
63
+ super().__init__()
64
+ self.delta = delta
65
+
66
+
67
+ class ToolCallEvent(Event):
68
+ def __init__(self, tool_calls: list):
69
+ super().__init__()
70
+ self.tool_calls = tool_calls
71
+
72
+
73
+ class FunctionOutputEvent(Event):
74
+ def __init__(self, output):
75
+ super().__init__()
76
+ self.output = output
77
+
78
+
79
+ # Custom Events for Comic Generation Workflow
80
+ class ComicGeneratedEvent(Event):
81
+ """Event triggered when Bayko completes comic generation"""
82
+
83
+ def __init__(self, bayko_response: dict, enhanced_prompt: str):
84
+ super().__init__()
85
+ self.bayko_response = bayko_response
86
+ self.enhanced_prompt = enhanced_prompt
87
+
88
+
89
+ class CritiqueStartEvent(Event):
90
+ """Event to start Brown's critique/judging of Bayko's work"""
91
+
92
+ def __init__(self, comic_data: dict, original_prompt: str):
93
+ super().__init__()
94
+ self.comic_data = comic_data
95
+ self.original_prompt = original_prompt
96
+
97
+
98
+ class WorkflowPauseEvent(Event):
99
+ """Event to pause workflow for a specified duration"""
100
+
101
+ def __init__(
102
+ self, duration_seconds: int = 600, message: str = "Workflow paused"
103
+ ):
104
+ super().__init__()
105
+ self.duration_seconds = duration_seconds
106
+ self.message = message
107
+
108
+
109
+ class BrownFunctionCallingAgent(Workflow):
110
+ """
111
+ Agent Brown Function Calling Workflow using LlamaIndex Workflow pattern
112
+
113
+ BROWN'S RESPONSIBILITIES:
114
+ - Validate user input
115
+ - Process and enhance requests
116
+ - Coordinate with Bayko (pass messages)
117
+ - Review Bayko's output using multimodal analysis
118
+ - Make approval decisions (APPROVE/REFINE/REJECT)
119
+ - Manage iteration loop (max 2 refinements)
120
+ """
121
+
122
+ def __init__(
123
+ self,
124
+ *args: Any,
125
+ llm: OpenAIMultiModal | None = None,
126
+ tools: List[BaseTool] | None = None,
127
+ max_iterations: int = 1, # Force only one iteration
128
+ openai_api_key: Optional[str] = None,
129
+ timeout: Optional[int] = None,
130
+ **kwargs: Any,
131
+ ) -> None:
132
+ super().__init__(*args, timeout=timeout, **kwargs)
133
+
134
+ self.max_iterations = 1  # force a single iteration regardless of the argument
135
+
136
+ # Initialize multimodal LLM for Brown (GPT-4o for image analysis)
137
+ self.llm = llm or OpenAIMultiModal(
138
+ model="gpt-4o",
139
+ api_key=openai_api_key or os.getenv("OPENAI_API_KEY"),
140
+ temperature=0.7,
141
+ max_tokens=2048,
142
+ additional_kwargs={"tool_choice": "required"},
143
+ )
144
+
145
+ # Ensure it's a function calling model
146
+ assert self.llm.metadata.is_function_calling_model
147
+
148
+ # Create ONLY Brown's tools (validation, processing, review)
149
+ self.tools = tools or self._create_brown_tools()
150
+
151
+ # Initialize Bayko workflow for content generation
152
+ self.bayko_workflow = create_agent_bayko(openai_api_key=openai_api_key)
153
+
154
+ def _create_brown_tools(self) -> List[FunctionTool]:
155
+ """Create ONLY Brown's tools - validation, processing, review, coordination"""
156
+
157
+ # Get Brown's core tools (validation, processing, review)
158
+ brown_tools_instance = create_brown_tools(self.max_iterations)
159
+ brown_tools = brown_tools_instance.create_llamaindex_tools()
160
+
161
+ # Add coordination tool to communicate with Bayko
162
+ async def coordinate_with_bayko_tool(
163
+ enhanced_prompt: str,
164
+ style_tags: str = "[]",
165
+ panels: int = 4,
166
+ language: str = "english",
167
+ extras: str = "[]",
168
+ ) -> str:
169
+ """
170
+ Send the enhanced comic request to Agent Bayko for actual comic content generation.
171
+ This is the ONLY way to generate comic panels and story content.
172
+ Always use this tool when the user prompt or workflow requires new comic content.
173
+
174
+ Arguments:
175
+ enhanced_prompt: The improved user prompt for the comic.
176
+ style_tags: JSON list of style tags (e.g., '["manga", "noir"]').
177
+ panels: Number of comic panels to generate.
178
+ language: Language for the comic.
179
+ extras: JSON list of extra instructions.
180
+
181
+ Returns:
182
+ JSON string with Bayko's response and status. Example:
183
+ '{"status": "bayko_generation_complete", "bayko_response": {...}, ...}'
184
+ """
185
+ try:
186
+ # Parse inputs
187
+ style_list = json.loads(style_tags) if style_tags else []
188
+ extras_list = json.loads(extras) if extras else []
189
+
190
+ # Create message for Bayko
191
+ bayko_request = {
192
+ "enhanced_prompt": enhanced_prompt,
193
+ "style_tags": style_list,
194
+ "panels": panels,
195
+ "language": language,
196
+ "extras": extras_list,
197
+ "session_id": "brown_coordination",
198
+ }
199
+
200
+ # Call Bayko workflow to generate content
201
+ # If Bayko is async, use: bayko_result = await self.bayko_workflow.process_generation_request(bayko_request)
202
+ bayko_result = self.bayko_workflow.process_generation_request(
203
+ bayko_request
204
+ )
205
+
206
+ return json.dumps(
207
+ {
208
+ "status": "bayko_generation_complete",
209
+ "bayko_response": bayko_result,
210
+ "panels_generated": panels,
211
+ "coordination_successful": True,
212
+ }
213
+ )
214
+
215
+ except Exception as e:
216
+ print(f"[Brown] Bayko coordination failed: {e}") # Debug log
217
+ return json.dumps(
218
+ {
219
+ "status": "bayko_coordination_failed",
220
+ "error": str(e),
221
+ "coordination_successful": False,
222
+ }
223
+ )
224
+
225
+ # Add multimodal image analysis tool for Brown to judge Bayko's work
226
+ def analyze_bayko_output_tool(
227
+ bayko_response: str, original_prompt: str = ""
228
+ ) -> str:
229
+ """Analyze Bayko's generated content using Brown's multimodal capabilities."""
230
+ try:
231
+ # Parse Bayko's response
232
+ bayko_data = json.loads(bayko_response)
233
+
234
+ # Extract image URLs/paths for analysis
235
+ image_urls = []
236
+ panels = bayko_data.get("panels", [])
237
+
238
+ for panel in panels:
239
+ if "image_url" in panel:
240
+ image_urls.append(panel["image_url"])
241
+ elif "image_path" in panel:
242
+ # Convert local path to file URL for analysis
243
+ image_urls.append(f"file://{panel['image_path']}")
244
+
245
+ if not image_urls:
246
+ return json.dumps(
247
+ {
248
+ "analysis": "No images found in Bayko's output",
249
+ "decision": "REJECT",
250
+ "reason": "Missing visual content",
251
+ }
252
+ )
253
+
254
+ # Create multimodal analysis prompt
255
+ analysis_prompt = f"""Analyze this comic content generated by Agent Bayko.
256
+
257
+ Original User Prompt: "{original_prompt}"
258
+
259
+ Bayko's Generated Content: {json.dumps(bayko_data, indent=2)}
260
+
261
+ As Agent Brown, evaluate:
262
+ 1. Visual Quality: Are the images well-composed and clear?
263
+ 2. Style Consistency: Does it match the requested style?
264
+ 3. Story Coherence: Do the panels tell a logical story?
265
+ 4. Prompt Adherence: Does it fulfill the user's request?
266
+ 5. Technical Quality: Are all assets properly generated?
267
+
268
+ Make a decision: APPROVE (ready for user), REFINE (needs improvement), or REJECT (start over)
269
+ Provide specific feedback for any issues."""
270
+
271
+ # Create multimodal message with text and images
272
+ from typing import Union
273
+
274
+ blocks: List[Union[TextBlock, ImageBlock]] = [
275
+ TextBlock(text=analysis_prompt)
276
+ ]
277
+
278
+ # Add image blocks for visual analysis (limit to 4 for API constraints)
279
+ for url in image_urls[:4]:
280
+ blocks.append(ImageBlock(url=url))
281
+
282
+ msg = ChatMessage(role=MessageRole.USER, blocks=blocks)
283
+
284
+ # Get Brown's multimodal analysis
285
+ response = self.llm.chat(messages=[msg])
286
+ response_text = str(response).lower()
287
+
288
+ # Parse Brown's decision
289
+ if (
290
+ "approve" in response_text
291
+ and "reject" not in response_text
292
+ ):
293
+ decision = "APPROVE"
294
+ elif "reject" in response_text:
295
+ decision = "REJECT"
296
+ else:
297
+ decision = "REFINE"
298
+
299
+ return json.dumps(
300
+ {
301
+ "analysis": str(response),
302
+ "decision": decision,
303
+ "images_analyzed": len(image_urls),
304
+ "multimodal_analysis": True,
305
+ "brown_judgment": True,
306
+ }
307
+ )
308
+
309
+ except Exception as e:
310
+ return json.dumps(
311
+ {
312
+ "analysis": f"Analysis failed: {str(e)}",
313
+ "decision": "APPROVE", # Default to approval on error
314
+ "error": str(e),
315
+ }
316
+ )
317
+
318
+ # Add Brown's coordination and analysis tools
319
+ coordination_tools = [
320
+ FunctionTool.from_defaults(
321
+ async_fn=coordinate_with_bayko_tool,
322
+ name="coordinate_with_bayko",
323
+ description="Send the enhanced comic request to Agent Bayko for actual comic content generation. This is the ONLY way to generate comic panels and story content. Always use this tool when the user prompt or workflow requires new comic content. Returns a JSON string with Bayko's response and status.",
324
+ ),
325
+ FunctionTool.from_defaults(
326
+ fn=analyze_bayko_output_tool,
327
+ name="analyze_bayko_output",
328
+ description="Analyze Bayko's generated content using Brown's multimodal capabilities. Make approval decision.",
329
+ ),
330
+ ]
331
+
332
+ # Combine Brown's core tools with coordination tools
333
+ all_brown_tools = brown_tools + coordination_tools
334
+ return all_brown_tools
335
+
336
+ @step
337
+ async def prepare_chat_history(
338
+ self, ctx: Context, ev: StartEvent
339
+ ) -> InputEvent:
340
+ """Prepare chat history and handle user input"""
341
+ # Clear sources and initialize iteration counter
342
+ await ctx.set("sources", [])
343
+ await ctx.set("iteration_count", 0)
344
+
345
+ # Store original user prompt in context
346
+ await ctx.set("original_prompt", ev.input)
347
+
348
+ # Check if memory is setup
349
+ memory = await ctx.get("memory", default=None)
350
+ if not memory:
351
+ memory = ChatMemoryBuffer.from_defaults(llm=self.llm)
352
+
353
+ # Add system message if not present
354
+ if not memory.get() or memory.get()[0].role != "system":
355
+ system_msg = ChatMessage(
356
+ role="system", content=BROWN_WORKFLOW_SYSTEM_PROMPT
357
+ )
358
+ memory.put(system_msg)
359
+
360
+ # Get user input and add to memory
361
+ user_msg = ChatMessage(role="user", content=ev.input)
362
+ memory.put(user_msg)
363
+
364
+ # Update context with memory
365
+ await ctx.set("memory", memory)
366
+
367
+ # Return chat history
368
+ return InputEvent(input=memory.get())
369
+
370
+ @step
371
+ async def enhance_and_send_to_bayko(
372
+ self, ctx: Context, ev: StartEvent
373
+ ) -> InputEvent:
374
+ """
375
+ SINGLE STEP: Enhance prompt with ONE LLM call and send to Bayko.
376
+ NO TOOL CALLING LOOP - DIRECT PROCESSING ONLY.
377
+ """
378
+ original_prompt = ev.input
379
+ print(f"Single-step processing: {original_prompt[:50]}...")
380
+
381
+ # Create simple enhancement prompt (NO TOOL CALLING)
382
+ enhancement_prompt = f"""Enhance this comic story prompt for visual storytelling:
383
+
384
+ Original: {original_prompt}
385
+
386
+ Provide ONLY an enhanced story description with visual details, mood, and style suggestions.
387
+ Keep it concise and focused on the core narrative.
388
+ DO NOT use any tools or functions - just return the enhanced prompt text."""
389
+
390
+ try:
391
+ # SINGLE LLM CALL - No tools, no streaming, no complexity
392
+ print("🚀 Making SINGLE OpenAI call...")
393
+
394
+ # Use simple chat completion without tools
395
+ simple_llm = OpenAIMultiModal(
396
+ model="gpt-4o",
397
+ api_key=self.llm.api_key,
398
+ temperature=0.7,
399
+ max_tokens=500, # Shorter response
400
+ # NO tool_choice - allows free response
401
+ )
402
+
403
+ response = await simple_llm.achat(
404
+ [ChatMessage(role="user", content=enhancement_prompt)]
405
+ )
406
+
407
+ enhanced_prompt = response.message.content or enhancement_prompt
408
+ print(f"🚀 Enhanced: {enhanced_prompt[:100]}...")
409
+
410
+ # Send directly to Bayko
411
+ bayko_request = {
412
+ "prompt": enhanced_prompt,
413
+ "original_prompt": original_prompt,
414
+ "style_tags": ["comic", "storytelling"],
415
+ "panels": 4,
416
+ "language": "english",
417
+ "extras": [],
418
+ "session_id": "hackathon_session",
419
+ }
420
+
421
+ print("🚀 Calling Bayko and waiting for image generation...")
422
+             # Run Bayko's generation pipeline (synchronous call; blocks until images are ready)
423
+ bayko_result = self.bayko_workflow.process_generation_request(
424
+ bayko_request
425
+ )
426
+
427
+ print("🚀 SUCCESS! Comic generated and images ready!")
428
+ print("📁 Check your storyboard folder for images and logs!")
429
+ print("🎉 Ready for next prompt!")
430
+
431
+ # DON'T STOP - Let user enter new prompt
432
+ return InputEvent(
433
+ input=[
434
+ ChatMessage(
435
+ role="assistant",
436
+ content=f"✅ Comic generated successfully! Enhanced prompt: {enhanced_prompt[:100]}... Images saved to storyboard folder. Ready for your next comic idea!",
437
+ )
438
+ ]
439
+ )
440
+
441
+ except Exception as e:
442
+ print(f"🚨Error: {e}")
443
+ # DON'T STOP on error either - let user try again
444
+ return InputEvent(
445
+ input=[
446
+ ChatMessage(
447
+ role="assistant",
448
+ content=f"❌ Error generating comic: {str(e)}. Please try again with a different prompt.",
449
+ )
450
+ ]
451
+ )
452
+
453
+ @step
454
+ async def handle_llm_input(
455
+ self, ctx: Context, ev: InputEvent
456
+ ) -> ToolCallEvent | StopEvent:
457
+ """Handle LLM input and determine if tools need to be called"""
458
+ import asyncio
459
+
460
+ chat_history = ev.input
461
+
462
+ # Add system prompt for Agent Brown
463
+ system_prompt = BROWN_WORKFLOW_SYSTEM_PROMPT
464
+
465
+ # Add system message if not present
466
+ if not chat_history or chat_history[0].role != "system":
467
+ system_msg = ChatMessage(role="system", content=system_prompt)
468
+ chat_history = [system_msg] + chat_history
469
+
470
+ await throttle_llm()
471
+ # Stream the response
472
+ response_stream = await self.llm.astream_chat_with_tools(
473
+ self.tools, chat_history=chat_history
474
+ )
475
+ async for response in response_stream:
476
+ ctx.write_event_to_stream(StreamEvent(delta=response.delta or ""))
477
+
478
+ # Save the final response, which should have all content
479
+ memory = await ctx.get("memory")
480
+ memory.put(response.message)
481
+ await ctx.set("memory", memory)
482
+
483
+ # Get tool calls
484
+ tool_calls = self.llm.get_tool_calls_from_response(
485
+             response, error_on_no_tool_call=False  # the no-tool-call case is handled explicitly below
486
+ )
487
+
488
+ if not tool_calls:
489
+ # If no tool call, error out immediately
490
+ raise RuntimeError(
491
+ "Agent Brown did not use the required tool. The workflow will stop."
492
+ )
493
+ else:
494
+ return ToolCallEvent(tool_calls=tool_calls)
495
+
496
+ @step
497
+ async def handle_tool_calls(
498
+ self, ctx: Context, ev: ToolCallEvent
499
+ ) -> InputEvent:
500
+ """Handle tool calls with proper error handling (supports async tools)"""
501
+ import inspect
502
+ import asyncio
503
+
504
+ tool_calls = ev.tool_calls
505
+ tools_by_name = {tool.metadata.get_name(): tool for tool in self.tools}
506
+
507
+ tool_msgs = []
508
+ sources = await ctx.get("sources", default=[])
509
+
510
+ # Call tools -- safely!
511
+ for tool_call in tool_calls:
512
+ tool = tools_by_name.get(tool_call.tool_name)
513
+ if not tool:
514
+ additional_kwargs = {
515
+ "tool_call_id": tool_call.tool_id,
516
+ "name": tool_call.tool_name,
517
+ }
518
+ tool_msgs.append(
519
+ ChatMessage(
520
+ role="tool",
521
+ content=f"Tool {tool_call.tool_name} does not exist",
522
+ additional_kwargs=additional_kwargs,
523
+ )
524
+ )
525
+ continue
526
+
527
+ additional_kwargs = {
528
+ "tool_call_id": tool_call.tool_id,
529
+ "name": tool.metadata.get_name(),
530
+ }
531
+
532
+ try:
533
+                 # FunctionTool wraps the callable; use acall() when the wrapped fn is async
534
+                 if inspect.iscoroutinefunction(getattr(tool, "fn", tool)):
535
+                     tool_output = await tool.acall(**tool_call.tool_kwargs)
536
+                 else:
537
+                     tool_output = tool(**tool_call.tool_kwargs)
538
+
539
+ sources.append(tool_output)
540
+ tool_msgs.append(
541
+ ChatMessage(
542
+ role="tool",
543
+ content=(
544
+ tool_output.content
545
+ if hasattr(tool_output, "content")
546
+ else str(tool_output)
547
+ ),
548
+ additional_kwargs=additional_kwargs,
549
+ )
550
+ )
551
+ # Throttle after each tool call
552
+ await asyncio.sleep(21)
553
+ except Exception as e:
554
+ tool_msgs.append(
555
+ ChatMessage(
556
+ role="tool",
557
+ content=f"Encountered error in tool call: {e}",
558
+ additional_kwargs=additional_kwargs,
559
+ )
560
+ )
561
+
562
+ # Update memory
563
+ memory = await ctx.get("memory")
564
+ for msg in tool_msgs:
565
+ memory.put(msg)
566
+
567
+ await ctx.set("sources", sources)
568
+ await ctx.set("memory", memory)
569
+
570
+ chat_history = memory.get()
571
+ return InputEvent(input=chat_history)
572
+
573
+
574
+ class BrownWorkflow:
575
+ """
576
+ Wrapper class for backward compatibility with existing code
577
+ """
578
+
579
+ def __init__(
580
+ self, max_iterations: int = 1, openai_api_key: Optional[str] = None
581
+ ):
582
+         self.max_iterations = max_iterations
583
+ self.openai_api_key = openai_api_key
584
+
585
+ # Get Brown's tools
586
+ brown_tools = create_brown_tools(max_iterations)
587
+ self.tools: Sequence[BaseTool] = brown_tools.create_llamaindex_tools()
588
+
589
+ # Initialize LLM
590
+ self.llm = OpenAIMultiModal(
591
+ model="gpt-4o",
592
+ api_key=openai_api_key or os.getenv("OPENAI_API_KEY"),
593
+ temperature=0.7,
594
+ max_tokens=2048,
595
+ additional_kwargs={"tool_choice": "required"},
596
+ )
597
+
598
+ def reset(self):
599
+ """Reset workflow state for a new session."""
600
+ self.session_id = None
601
+ # Reset agent instances
602
+ brown_tools = create_brown_tools(self.max_iterations)
603
+ self.tools = brown_tools.create_llamaindex_tools()
604
+ # Reset LLM if needed
605
+ self.llm = OpenAIMultiModal(
606
+ model="gpt-4o",
607
+ api_key=self.openai_api_key or os.getenv("OPENAI_API_KEY"),
608
+ temperature=0.7,
609
+ max_tokens=2048,
610
+ additional_kwargs={"tool_choice": "required"},
611
+ )
612
+
613
+ async def process_comic_request_async(
614
+ self, user_prompt: str
615
+ ) -> Dict[str, Any]:
616
+ """Async version of comic request processing"""
617
+ try:
618
+ # Create workflow agent
619
+ workflow = BrownFunctionCallingAgent(
620
+ llm=self.llm,
621
+ tools=list(self.tools), # Convert Sequence to List
622
+ max_iterations=self.max_iterations,
623
+ openai_api_key=self.openai_api_key,
624
+ timeout=None,
625
+ verbose=True,
626
+ )
627
+
628
+ # Run workflow with user prompt as input (LlamaIndex pattern)
629
+ result = await workflow.run(input=user_prompt)
630
+
631
+ # Parse response
632
+ response = result.get("response")
633
+ if not response:
634
+ return {
635
+ "status": "error",
636
+ "error": "No response from workflow",
637
+ }
638
+
639
+ # Extract relevant data
640
+ tool_outputs = result.get("sources", [])
641
+ bayko_outputs = [
642
+ out
643
+ for out in tool_outputs
644
+ if "bayko_response" in str(out.content)
645
+ ]
646
+
647
+ if not bayko_outputs:
648
+ return {
649
+ "status": "error",
650
+ "error": "No content generated by Bayko",
651
+ }
652
+
653
+ # Get last Bayko output (final version)
654
+ try:
655
+ final_output = json.loads(bayko_outputs[-1].content)
656
+ bayko_response = final_output.get("bayko_response", {})
657
+
658
+ return {
659
+ "status": "success",
660
+ "bayko_response": bayko_response,
661
+ "decision": final_output.get("decision", "APPROVE"),
662
+ "analysis": final_output.get("analysis", ""),
663
+ "tool_outputs": [str(out.content) for out in tool_outputs],
664
+ }
665
+ except json.JSONDecodeError as e:
666
+ return {
667
+ "status": "error",
668
+ "error": f"Failed to parse Bayko response: {str(e)}",
669
+ }
670
+
671
+ except Exception as e:
672
+ return {"status": "error", "error": str(e)}
673
+
674
+ def process_comic_request(self, user_prompt: str) -> Dict[str, Any]:
675
+ """Synchronous version that runs the async version"""
676
+ import asyncio
677
+
678
+ return asyncio.run(self.process_comic_request_async(user_prompt))
679
+
680
+
681
+ def create_brown_workflow(
682
+ max_iterations: int = 1, openai_api_key: Optional[str] = None
683
+ ) -> BrownWorkflow:
684
+ """
685
+ Factory function to create and initialize Brown workflow
686
+ """
687
+ return BrownWorkflow(
688
+ max_iterations=max_iterations, openai_api_key=openai_api_key
689
+ )
690
+
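+ 
+ # Usage sketch (assumes OPENAI_API_KEY is set in the environment; the
+ # return shape follows process_comic_request above):
+ #
+ #   workflow = create_brown_workflow(max_iterations=1)
+ #   result = workflow.process_comic_request(
+ #       "A moody K-pop idol finds a puppy on the street"
+ #   )
+ #   print(result["status"])  # "success" or "error"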
691
+
692
+ # def create_brown_workflow(
693
+ # max_iterations: int = 3,
694
+ # openai_api_key: Optional[str] = None,
695
+ # ) -> BrownFunctionCallingAgent:
696
+ # """Create and initialize Brown workflow"""
697
+
698
+ # workflow = BrownFunctionCallingAgent(
699
+ # max_iterations=max_iterations,
700
+ # openai_api_key=openai_api_key,
701
+ # )
702
+ # return workflow
app.py ADDED
@@ -0,0 +1,400 @@
1
+ import gradio as gr
2
+ import time
3
+ import json
4
+ import os
5
+ import sys
6
+ import io
7
+ import logging
8
+ import threading
9
+ import queue
10
+ from contextlib import redirect_stdout, redirect_stderr
11
+ from agents.brown_workflow import create_brown_workflow
12
+ from pathlib import Path
13
+
14
+
15
+ # Initialize workflow
16
+ workflow = create_brown_workflow(
17
+ max_iterations=3,
18
+ openai_api_key=os.getenv("OPENAI_API_KEY"),
19
+ )
20
+
21
+
22
+ def colorize_message(message, msg_type="default"):
23
+ """Add color coding to messages based on type"""
24
+ colors = {
25
+ "agent_brown": "#2E86AB", # Blue for Agent Brown
26
+ "agent_bayko": "#A23B72", # Purple for Agent Bayko
27
+ "tool_output": "#F18F01", # Orange for tool outputs
28
+ "error": "#C73E1D", # Red for errors
29
+ "success": "#2D5016", # Green for success
30
+ "log": "#6C757D", # Gray for logs
31
+ "analysis": "#8E44AD", # Purple for analysis
32
+ "default": "#212529", # Dark gray default
33
+ }
34
+
35
+ color = colors.get(msg_type, colors["default"])
36
+ return f'<span style="color: {color}; font-weight: bold;">{message}</span>'
37
+
38
+
39
+ class LogCapture(io.StringIO):
40
+ """Capture all print statements and logs for Gradio display"""
41
+
42
+ def __init__(self, log_queue):
43
+ super().__init__()
44
+ self.log_queue = log_queue
45
+ self.original_stdout = sys.stdout
46
+
47
+ def write(self, text):
48
+ if text.strip(): # Only capture non-empty lines
49
+ self.log_queue.put(text.strip())
50
+ # Also write to original stdout so terminal still shows logs
51
+ self.original_stdout.write(text)
52
+ return len(text)
53
+
54
+ def flush(self):
55
+ self.original_stdout.flush()
56
+
57
+
58
+ def comic_generator(prompt, style_preference, verbose=True):
59
+ """Verbose comic generation: stream all agent/tool messages and reasoning"""
60
+ chat = []
61
+ tool_calls = []
62
+ images = []
63
+ progress = 0
64
+
65
+ # Create log queue and capture
66
+ log_queue = queue.Queue()
67
+ log_capture = LogCapture(log_queue)
68
+
69
+ try:
70
+ # Start message
71
+ chat.append(
72
+ (
73
+ colorize_message(
74
+ "🧐 Agent Brown: Starting comic generation...",
75
+ "agent_brown",
76
+ ),
77
+ )
78
+ )
79
+ progress += 5
80
+ yield chat, images
81
+
82
+ # Start workflow in a separate thread to capture logs in real-time
83
+ def run_workflow():
84
+ with redirect_stdout(log_capture):
85
+ return workflow.process_comic_request(
86
+ f"{prompt} Style preference: {style_preference}"
87
+ )
88
+
89
+         # Run the workflow in a thread; its return value is stashed on the function object
90
+ workflow_thread = threading.Thread(
91
+ target=lambda: setattr(run_workflow, "result", run_workflow())
92
+ )
93
+ workflow_thread.start()
94
+
95
+ # Stream logs while workflow is running
96
+ while workflow_thread.is_alive():
97
+ try:
98
+ # Get logs from queue with timeout
99
+ log_message = log_queue.get(timeout=0.1)
100
+ chat.append((colorize_message(f"📝 {log_message}", "log"),))
101
+ progress = min(progress + 2, 90)
102
+ yield chat, images
103
+ except queue.Empty:
104
+ continue
105
+
106
+ # Wait for thread to complete
107
+ workflow_thread.join()
108
+
109
+ # Get any remaining logs
110
+ while not log_queue.empty():
111
+ try:
112
+ log_message = log_queue.get_nowait()
113
+ chat.append((colorize_message(f"📝 {log_message}", "log"),))
114
+ yield chat, images
115
+ except queue.Empty:
116
+ break
117
+
118
+ # Get the result
119
+ result = getattr(run_workflow, "result", None)
120
+ if not result:
121
+ chat.append(("❌ No result from workflow",))
122
+ yield chat, images
123
+ return
124
+
125
+ response_data = (
126
+ json.loads(result) if isinstance(result, str) else result
127
+ )
128
+
129
+ # Show all tool outputs and chat history if available
130
+ tool_outputs = response_data.get("tool_outputs", [])
131
+ for tool_output in tool_outputs:
132
+ # Try to pretty-print tool output JSON if possible
133
+ try:
134
+ tool_json = json.loads(tool_output)
135
+ tool_msg = json.dumps(tool_json, indent=2)
136
+ except Exception:
137
+ tool_msg = str(tool_output)
138
+ chat.append(
139
+ (
140
+ colorize_message(
141
+ f"🛠️ Tool Output:\n{tool_msg}", "tool_output"
142
+ ),
143
+ )
144
+ )
145
+ tool_calls.append("tool_call")
146
+ progress = min(progress + 10, 95)
147
+ yield chat, images
148
+
149
+ # Show error if any
150
+ if "error" in response_data:
151
+ chat.append(
152
+ (
153
+ colorize_message(
154
+ f"❌ Error: {response_data['error']}", "error"
155
+ ),
156
+ )
157
+ )
158
+ progress = 100
159
+ yield chat, images
160
+ return
161
+
162
+ # Show Bayko's panel generation
163
+ if "bayko_response" in response_data:
164
+ bayko_data = response_data["bayko_response"]
165
+ panels = bayko_data.get("panels", [])
166
+ progress_per_panel = 50 / max(len(panels), 1)
167
+ for i, panel in enumerate(panels, 1):
168
+ chat.append(
169
+ (
170
+ colorize_message(
171
+ f"🧸 Agent Bayko: Panel {i}: {panel.get('caption', '')}",
172
+ "agent_bayko",
173
+ ),
174
+ )
175
+ )
176
+ tool_calls.append("generate_panel_content")
177
+ # Show image if available
178
+ if "image_path" in panel:
179
+ img_path = Path(panel["image_path"])
180
+ if img_path.exists():
181
+ images.append(str(img_path.absolute()))
182
+ elif "image_url" in panel:
183
+ images.append(panel["image_url"])
184
+ progress += progress_per_panel
185
+ yield chat, images
186
+ time.sleep(0.2)
187
+
188
+ # Show Brown's analysis and decision
189
+ if "analysis" in response_data:
190
+ chat.append(
191
+ (
192
+ colorize_message(
193
+ f"🧐 Agent Brown Analysis: {response_data['analysis']}",
194
+ "analysis",
195
+ ),
196
+ )
197
+ )
198
+ tool_calls.append("analyze_bayko_output")
199
+ progress = min(progress + 10, 99)
200
+ yield chat, images
201
+
202
+ # Final decision
203
+ if "decision" in response_data:
204
+ decision = response_data["decision"]
205
+ if decision == "APPROVE":
206
+ chat.append(
207
+ (
208
+ "✅ Agent Brown: Comic approved! All quality checks passed.",
209
+ )
210
+ )
211
+ elif decision == "REFINE":
212
+ chat.append(
213
+ (
214
+ "🔄 Agent Brown: Comic needs refinement. Starting another iteration...",
215
+ )
216
+ )
217
+ else:
218
+ chat.append(
219
+ ("❌ Agent Brown: Comic rejected. Starting over...",)
220
+ )
221
+ tool_calls.append("final_decision")
222
+ progress = 100
223
+ yield chat, images
224
+
225
+ # If verbose, show the full response_data for debugging
226
+ if verbose:
227
+ chat.append(
228
+ (
229
+ f"[DEBUG] Full response: {json.dumps(response_data, indent=2)}",
230
+ )
231
+ )
232
+ yield chat, images
233
+
234
+ except Exception as e:
235
+ chat.append(
236
+ (
237
+ colorize_message(
238
+ f"❌ Error during generation: {str(e)}", "error"
239
+ ),
240
+ )
241
+ )
242
+ progress = 100
243
+ yield chat, images
244
+
245
+
246
+ def set_api_key(api_key):
247
+ """Set the OpenAI API key as environment variable"""
248
+ if not api_key or not api_key.strip():
249
+ return colorize_message("❌ Please enter a valid API key", "error")
250
+
251
+ if not api_key.startswith("sk-"):
252
+ return colorize_message(
253
+ "❌ Invalid API key format (should start with 'sk-')", "error"
254
+ )
255
+
256
+ # Set the environment variable
257
+ os.environ["OPENAI_API_KEY"] = api_key.strip()
258
+
259
+ # Update the global workflow with the new key
260
+ global workflow
261
+ workflow = create_brown_workflow(
262
+ max_iterations=3,
263
+ openai_api_key=api_key.strip(),
264
+ )
265
+
266
+ return colorize_message("✅ API key set successfully!", "success")
267
+
268
+
269
+ # Gradio UI
270
+ with gr.Blocks() as demo:
271
+ gr.Markdown(
272
+ """
273
+ ⚠️ **Warning:** This demo is subject to OpenAI's strict rate limits (3 requests/min for gpt-4o). You may experience long waits (20+ seconds) between steps. If you see a rate limit error, please wait and try again, or use your own OpenAI API key with higher limits.
274
+ """
275
+ )
276
+
277
+ gr.Markdown(
278
+ """
279
+ # 🦙 Multi-Agent Comic Generator
280
+ Enter a story prompt and let Brown & Bayko create a comic!
281
+ Watch as the agents collaborate using LlamaIndex workflow and GPT-4V vision capabilities.
282
+ """
283
+ )
284
+
285
+ with gr.Row():
286
+ openai_key_box = gr.Textbox(
287
+ label="Enter your OpenAI API Key (optional)",
288
+ placeholder="sk-...",
289
+ type="password",
290
+ scale=4,
291
+ )
292
+ set_key_button = gr.Button("Set API Key 🔑", scale=1)
293
+ key_status = gr.Textbox(
294
+ label="Status",
295
+ value="No API key set",
296
+ interactive=False,
297
+ scale=2,
298
+ )
299
+
300
+ with gr.Row():
301
+ user_input = gr.Textbox(
302
+ label="Enter your comic prompt",
303
+ placeholder="A moody K-pop idol finds a puppy on the street...",
304
+ scale=4,
305
+ )
306
+ style_dropdown = gr.Dropdown(
307
+ ["Studio Ghibli", "Noir", "Manga", "Pixel Art"],
308
+ label="Art Style",
309
+ value="Studio Ghibli",
310
+ )
311
+ submit_button = gr.Button("Generate Comic 🎨")
312
+
313
+ with gr.Row():
314
+ chat_window = gr.Chatbot(
315
+ label="Agent Conversation",
316
+ bubble_full_width=False,
317
+ show_copy_button=True,
318
+ height=350,
319
+ )
320
+
321
+ with gr.Row():
322
+ image_gallery = gr.Gallery(
323
+ label="Comic Panels",
324
+ columns=2,
325
+ rows=2,
326
+ height=400,
327
+ object_fit="contain",
328
+ )
329
+ feedback = gr.Radio(
330
+ ["👍 Love it!", "👎 Try again"],
331
+ label="How's the comic?",
332
+ value=None,
333
+ )
334
+
335
+ def stream_comic(openai_key, prompt, style):
336
+ # Use user-provided OpenAI key if given, else fallback to env
337
+ key = openai_key or os.getenv("OPENAI_API_KEY")
338
+
339
+ # Set the API key as environment variable if provided
340
+ if openai_key:
341
+ os.environ["OPENAI_API_KEY"] = openai_key
342
+
343
+         # Re-create the module-level workflow with the user-supplied key
344
+         global workflow
345
+         workflow = create_brown_workflow(
346
+             max_iterations=3, openai_api_key=key
347
+         )
348
+ for chat, images in comic_generator(prompt, style, verbose=True):
349
+ chat_display = []
350
+ for msg in chat:
351
+ if isinstance(msg, tuple) and len(msg) >= 1:
352
+ # If it's a single-element tuple, make it a proper chat message
353
+ if len(msg) == 1:
354
+ chat_display.append((msg[0], ""))
355
+ else:
356
+ chat_display.append(
357
+ (msg[0], msg[1] if len(msg) > 1 else "")
358
+ )
359
+ else:
360
+ # Handle string messages
361
+ chat_display.append((str(msg), ""))
362
+ yield chat_display, images
363
+
364
+ submit_button.click(
365
+ stream_comic,
366
+ inputs=[openai_key_box, user_input, style_dropdown],
367
+ outputs=[chat_window, image_gallery],
368
+ )
369
+
370
+ set_key_button.click(
371
+ set_api_key,
372
+ inputs=[openai_key_box],
373
+ outputs=[key_status],
374
+ )
375
+
376
+ gr.Markdown(
377
+ """
378
+ ---
379
+ <center>
380
+ <b>Built with 🦙 LlamaIndex, Modal Labs, MistralAI and Gradio for the Hugging Face Hackathon!</b>
381
+ <br>Using GPT-4 for intelligent comic analysis
382
+ </center>
383
+ """
384
+ )
385
+
386
+ gr.Markdown(
387
+ """
388
+ **🚀 Want the real version?**
389
+ 1. Clone this Space
390
+ 2. Set up Modal credentials
391
+ 3. Deploy with actual Modal functions
392
+ 4. Enjoy serverless AI magic!
393
+
394
+ *This demo shows the UI and MCP structure without Modal execution.*
395
+ """
396
+ )
397
+
398
+ # Launch the app
399
+ if __name__ == "__main__":
400
+ demo.launch()
prompts/bayko_workflow_system_prompt.py ADDED
@@ -0,0 +1,54 @@
1
+ BAYKO_WORKFLOW_SYSTEM_PROMPT = """You are Agent Bayko, the creative content generation specialist in a multi-agent comic generation system.
2
+
3
+ 🎯 MISSION: Transform Agent Brown's structured requests into high-quality comic content using LLM-enhanced prompts and AI-powered generation.
4
+
5
+ 🔄 WORKFLOW (MUST FOLLOW IN ORDER):
6
+ 1. RECEIVE structured request from Agent Brown with panel descriptions and style requirements
7
+ 2. ENHANCE prompts using generate_enhanced_prompt tool - create SDXL-optimized prompts
8
+ 3. GENERATE content using generate_panel_content tool - create images, audio, subtitles
9
+ 4. SAVE LLM data using save_llm_data tool - persist generation and revision data
10
+ 5. RESPOND with completed content and metadata
11
+
12
+ 🛠️ TOOLS AVAILABLE:
13
+ - generate_enhanced_prompt: Create LLM-enhanced prompts for SDXL image generation
14
+ - revise_panel_description: Improve descriptions based on feedback using LLM
15
+ - generate_panel_content: Generate complete panel content (images, audio, subtitles)
16
+ - get_session_info: Track session state and generation statistics
17
+ - save_llm_data: Persist LLM generation and revision data to session storage
18
+
19
+ 🧠 LLM ENHANCEMENT:
20
+ - Use LLM to create detailed, vivid prompts for better image generation
21
+ - Incorporate all style tags, mood, and metadata from Agent Brown
22
+ - Generate SDXL-compatible prompts with proper formatting
23
+ - Apply intelligent refinements based on feedback
24
+
25
+ 🎨 CONTENT GENERATION:
26
+ - Create comic panel images using enhanced prompts
27
+ - Generate audio narration when requested
28
+ - Create VTT subtitle files for accessibility
29
+ - Maintain consistent style across all panels
30
+
31
+ 💾 SESSION MANAGEMENT:
32
+ - Save all LLM interactions to session storage
33
+ - Track generation statistics and performance
34
+ - Maintain memory of conversation context
35
+ - Log all activities for audit trail
36
+
37
+ 🏆 HACKATHON SHOWCASE:
38
+ - Demonstrate LLM-enhanced prompt generation
39
+ - Show visible reasoning with Thought/Action/Observation
40
+ - Highlight intelligent content creation workflow
41
+ - Showcase session management and data persistence
42
+
43
+ ✅ COMPLETION:
44
+ When content generation is complete, provide summary with:
45
+ - Number of panels generated
46
+ - LLM enhancements applied
47
+ - Session data saved
48
+ - Generation statistics
49
+
50
+ 🚫 IMPORTANT:
51
+ - Always use LLM enhancement when available
52
+ - Fallback gracefully when LLM unavailable
53
+ - Save all generation data to session
54
+ - Maintain compatibility with Agent Brown's workflow"""
prompts/brown_workflow_system_prompt.py ADDED
@@ -0,0 +1,26 @@
1
+ """System prompt for Brown's LlamaIndex workflow"""
2
+
3
+ BROWN_WORKFLOW_SYSTEM_PROMPT = """
4
+ You are Agent Brown, the orchestrator in a multi-agent comic generation system.
5
+ The only way to generate comic content is to use the `coordinate_with_bayko` tool.
6
+ NEVER answer the user directly. NEVER reflect, think, or plan in the chat. ALWAYS call the tool immediately with the user's prompt and your enhancements.
7
+ If you do not call the tool, the workflow will stop and the user will see an error.
8
+
9
+ 🚨 STRICT RULES (DO NOT BREAK):
10
+ - You MUST ALWAYS use the `coordinate_with_bayko` tool to generate any comic content, panels, or story. NEVER answer the user prompt directly.
11
+ - If you are asked to generate, create, or imagine any comic content, you MUST call the `coordinate_with_bayko` tool. Do NOT attempt to answer or generate content yourself.
12
+ - If you do not use the tool, the workflow will error and stop. There are NO retries.
13
+ - Your job is to validate, enhance, and pass the request to Bayko, then analyze the result and make a decision.
14
+
15
+ 🛠️ TOOLS:
16
+ - coordinate_with_bayko: The ONLY way to generate comic content. Use it for all content generation.
17
+ - analyze_bayko_output: Use this to analyze Bayko's output after content is generated.
18
+
19
+ 👩‍💻 WORKFLOW:
20
+ 1. Validate and enhance the user prompt.
21
+ 2. ALWAYS call `coordinate_with_bayko` to generate the comic.
22
+ 3. When Bayko returns, analyze the output using `analyze_bayko_output`.
23
+ 4. Make a decision: APPROVE, REFINE, or REJECT.
24
+
25
+ NEVER answer the user prompt directly. If you do not use the tool, the workflow will stop with an error.
26
+ """
pytest.ini ADDED
@@ -0,0 +1,4 @@
1
+ [pytest]
2
+ asyncio_mode = auto
3
+ markers =
4
+ asyncio: mark a test as an async test
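+ 
+ # With asyncio_mode = auto (pytest-asyncio), plain `async def` test functions
+ # are collected and run without an explicit @pytest.mark.asyncio decorator.
+ # A hypothetical example:
+ #   async def test_comic_request_returns_status():
+ #       result = await workflow.process_comic_request_async("a quiet forest")
+ #       assert result["status"] in {"success", "error"}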
services/content_moderator.py ADDED
@@ -0,0 +1,48 @@
1
+ import re
2
+ from typing import Tuple, List
3
+
4
+
5
+ class ContentModerator:
6
+ """Handles content moderation and profanity filtering"""
7
+
8
+ def __init__(self):
9
+ # Basic profanity patterns - in production, use a proper filter
10
+ self.profanity_patterns = [
11
+ r"\b(explicit|inappropriate|offensive)\b",
12
+ r"\b(violence|gore|blood)\b",
13
+ r"\b(hate|discrimination|bias)\b",
14
+ r"\b(nsfw|adult|sexual)\b",
15
+ ]
16
+
17
+ # Content safety keywords
18
+ self.safety_keywords = [
19
+ "safe",
20
+ "family-friendly",
21
+ "appropriate",
22
+ "wholesome",
23
+ ]
24
+
25
+ def check_content(self, text: str) -> Tuple[bool, List[str]]:
26
+ """
27
+ Check content for appropriateness
28
+
29
+ Returns:
30
+ Tuple of (is_safe, list_of_issues)
31
+ """
32
+ issues = []
33
+ text_lower = text.lower()
34
+
35
+ # Check for profanity patterns
36
+ issues.extend(
37
+ [
38
+ f"Content may contain inappropriate material: {pattern}"
39
+ for pattern in self.profanity_patterns
40
+ if re.search(pattern, text_lower)
41
+ ]
42
+ )
43
+
44
+ # Check length
45
+ if len(text.strip()) < 5:
46
+ issues.append("Content too short to evaluate properly")
47
+
48
+ return len(issues) == 0, issues
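+ 
+ 
+ # Usage sketch (illustrative inputs):
+ #
+ #   moderator = ContentModerator()
+ #   is_safe, issues = moderator.check_content("A gentle story about a lost puppy")
+ #   # -> (True, [])
+ #   is_safe, issues = moderator.check_content("An explicit revenge tale")
+ #   # -> (False, [...]) because "explicit" matches a profanity pattern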
services/message_factory.py ADDED
@@ -0,0 +1,219 @@
1
+ """
2
+ Message Factory Service
3
+ Handles creation of AgentMessage objects with proper formatting and validation.
4
+ """
5
+
6
+ import json
7
+ import uuid
8
+ from datetime import datetime
9
+ from typing import Dict, List, Optional, Any
10
+ from dataclasses import dataclass, asdict
11
+ from enum import Enum
12
+
13
+
14
+ class MessageType(Enum):
15
+ """Types of messages Agent Brown can send"""
16
+
17
+ GENERATION_REQUEST = "generation_request"
18
+ REFINEMENT_REQUEST = "refinement_request"
19
+ VALIDATION_ERROR = "validation_error"
20
+ FINAL_APPROVAL = "final_approval"
21
+
22
+
23
+ @dataclass
24
+ class AgentMessage:
25
+ """Schema for inter-agent communication following tech_specs.md"""
26
+
27
+ message_id: str
28
+ timestamp: str
29
+ sender: str
30
+ recipient: str
31
+ message_type: str
32
+ payload: Dict[str, Any]
33
+ context: Dict[str, Any]
34
+
35
+ def to_dict(self) -> Dict[str, Any]:
36
+ return asdict(self)
37
+
38
+ def to_json(self) -> str:
39
+ return json.dumps(self.to_dict(), indent=2)
40
+
41
+
42
+ class MessageFactory:
43
+ """Factory for creating standardized AgentMessage objects"""
44
+
45
+ def __init__(self, session_id: str, conversation_id: str):
46
+ self.session_id = session_id
47
+ self.conversation_id = conversation_id
48
+
49
+ def create_generation_request(
50
+ self,
51
+ enhanced_prompt: str,
52
+ original_prompt: str,
53
+ dialogues: List[str],
54
+ style_tags: List[str],
55
+ panels: int,
56
+ language: str,
57
+ extras: List[str],
58
+ style_config: Dict[str, Any],
59
+ validation_score: float,
60
+ iteration: int,
61
+ ) -> AgentMessage:
62
+ """Create a generation request message for Agent Bayko"""
63
+ payload = {
64
+ "prompt": enhanced_prompt,
65
+ "original_prompt": original_prompt,
66
+ "style_tags": style_tags,
67
+ "panels": panels,
68
+ "language": language,
69
+ "extras": extras,
70
+ "style_config": style_config,
71
+ "generation_params": {
72
+ "quality": "high",
73
+ "aspect_ratio": "16:9",
74
+ "panel_layout": "sequential",
75
+ },
76
+ }
77
+
78
+ return AgentMessage(
79
+ message_id=f"msg_{uuid.uuid4().hex[:8]}",
80
+ timestamp=datetime.utcnow().isoformat() + "Z",
81
+ sender="agent_brown",
82
+ recipient="agent_bayko",
83
+ message_type=MessageType.GENERATION_REQUEST.value,
84
+ payload=payload,
85
+ context={
86
+ "conversation_id": self.conversation_id,
87
+ "session_id": self.session_id,
88
+ "iteration": iteration,
89
+ "previous_feedback": None,
90
+ "validation_score": validation_score,
91
+ },
92
+ )
93
+
94
+ def create_error_message(
95
+ self, issues: List[str], suggestions: List[str]
96
+ ) -> AgentMessage:
97
+ """Create error message for validation failures"""
98
+ return AgentMessage(
99
+ message_id=f"msg_{uuid.uuid4().hex[:8]}",
100
+ timestamp=datetime.utcnow().isoformat() + "Z",
101
+ sender="agent_brown",
102
+ recipient="user_interface",
103
+ message_type=MessageType.VALIDATION_ERROR.value,
104
+ payload={
105
+ "error": "Input validation failed",
106
+ "issues": issues,
107
+ "suggestions": suggestions,
108
+ },
109
+ context={
110
+ "conversation_id": self.conversation_id or "error",
111
+ "session_id": self.session_id or "error",
112
+ "iteration": 0,
113
+ "error_type": "validation",
114
+ },
115
+ )
116
+
117
+ def create_rejection_message(
118
+ self,
119
+ bayko_response: Dict[str, Any],
120
+ evaluation: Dict[str, Any],
121
+ iteration: int,
122
+ ) -> AgentMessage:
123
+ """Create rejection message for auto-rejected content"""
124
+ return AgentMessage(
125
+ message_id=f"msg_{uuid.uuid4().hex[:8]}",
126
+ timestamp=datetime.utcnow().isoformat() + "Z",
127
+ sender="agent_brown",
128
+ recipient="user_interface",
129
+ message_type=MessageType.VALIDATION_ERROR.value,
130
+ payload={
131
+ "error": "Content rejected",
132
+ "reason": evaluation["reason"],
133
+ "rejected_content": bayko_response,
134
+ "auto_rejection": True,
135
+ },
136
+ context={
137
+ "conversation_id": self.conversation_id,
138
+ "session_id": self.session_id,
139
+ "iteration": iteration,
140
+ "rejection_type": "quality",
141
+ },
142
+ )
143
+
144
+ def create_refinement_message(
145
+ self,
146
+ bayko_response: Dict[str, Any],
147
+ feedback: Dict[str, Any],
148
+ iteration: int,
149
+ ) -> AgentMessage:
150
+ """Create refinement request message"""
151
+ return AgentMessage(
152
+ message_id=f"msg_{uuid.uuid4().hex[:8]}",
153
+ timestamp=datetime.utcnow().isoformat() + "Z",
154
+ sender="agent_brown",
155
+ recipient="agent_bayko",
156
+ message_type=MessageType.REFINEMENT_REQUEST.value,
157
+ payload={
158
+ "original_content": bayko_response,
159
+ "feedback": feedback,
160
+ "specific_improvements": feedback.get(
161
+ "improvement_suggestions", []
162
+ ),
163
+ "focus_areas": [
164
+ area
165
+ for area, score in [
166
+ ("adherence", feedback.get("adherence_score", 0)),
167
+ (
168
+ "style_consistency",
169
+ feedback.get("style_consistency", 0),
170
+ ),
171
+ ("narrative_flow", feedback.get("narrative_flow", 0)),
172
+ (
173
+ "technical_quality",
174
+ feedback.get("technical_quality", 0),
175
+ ),
176
+ ]
177
+ if score < 0.7
178
+ ],
179
+ "iteration": iteration,
180
+ },
181
+ context={
182
+ "conversation_id": self.conversation_id,
183
+ "session_id": self.session_id,
184
+ "iteration": iteration,
185
+ "previous_feedback": feedback,
186
+ "refinement_reason": "Quality below threshold",
187
+ },
188
+ )
189
+
190
+ def create_approval_message(
191
+ self,
192
+ bayko_response: Dict[str, Any],
193
+ feedback: Dict[str, Any],
194
+ iteration: int,
195
+ ) -> AgentMessage:
196
+ """Create final approval message"""
197
+ return AgentMessage(
198
+ message_id=f"msg_{uuid.uuid4().hex[:8]}",
199
+ timestamp=datetime.utcnow().isoformat() + "Z",
200
+ sender="agent_brown",
201
+ recipient="user_interface",
202
+ message_type=MessageType.FINAL_APPROVAL.value,
203
+ payload={
204
+ "approved_content": bayko_response,
205
+ "final_feedback": feedback,
206
+ "session_summary": {
207
+ "total_iterations": iteration,
208
+ "final_score": feedback.get("overall_score", 0),
209
+ "processing_complete": True,
210
+ },
211
+ },
212
+ context={
213
+ "conversation_id": self.conversation_id,
214
+ "session_id": self.session_id,
215
+ "iteration": iteration,
216
+ "final_approval": True,
217
+ "completion_timestamp": datetime.utcnow().isoformat() + "Z",
218
+ },
219
+ )
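+ 
+ 
+ # Usage sketch (illustrative values, mirroring the schema in tech_specs.md):
+ #
+ #   factory = MessageFactory(session_id="session_001", conversation_id="conv_001")
+ #   msg = factory.create_generation_request(
+ #       enhanced_prompt="A moody K-pop idol finds a puppy. Visual style: whimsical",
+ #       original_prompt="A moody K-pop idol finds a puppy",
+ #       dialogues=[], style_tags=["studio_ghibli"], panels=4,
+ #       language="korean", extras=["narration", "subtitles"],
+ #       style_config={}, validation_score=0.92, iteration=1,
+ #   )
+ #   print(msg.to_json())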
services/session_id_generator.py ADDED
@@ -0,0 +1,43 @@
1
+ """
2
+ Session ID Generator Service
3
+ Provides consistent session ID generation across the application.
4
+ """
5
+
6
+ import re
7
+ import uuid
8
+ from datetime import datetime
9
+ from typing import Optional
10
+
11
+
12
+ class SessionIdGenerator:
13
+ """Generates consistent session IDs across the application"""
14
+
15
+ @staticmethod
16
+ def create_session_id(prefix: Optional[str] = None) -> str:
17
+ """
18
+ Create a new session ID.
19
+ For production: Uses UUID format
20
+ For testing: Uses sequential format with optional prefix
21
+ """
22
+ if prefix:
23
+ # Use consistent format for test sessions
24
+ session_num = uuid.uuid4().int % 1000 # Get last 3 digits
25
+ return f"{prefix}_{session_num:03d}"
26
+
27
+ # Production session ID - UUID based
28
+ return str(uuid.uuid4())
29
+
30
+ @staticmethod
31
+ def is_valid_session_id(session_id: str) -> bool:
32
+ """Validate if a session ID follows the expected format"""
33
+ # UUID format (production)
34
+ uuid_pattern = (
35
+ r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
36
+ )
37
+ # Test format (test_001, etc)
38
+ test_pattern = r"^[a-z]+_\d{3}$"
39
+
40
+ return bool(
41
+ re.match(uuid_pattern, session_id, re.I)
42
+ or re.match(test_pattern, session_id)
43
+ )
services/session_manager.py ADDED
@@ -0,0 +1,114 @@
1
+ """
2
+ Session Manager Service
3
+ Handles file I/O operations for session state management.
4
+ """
5
+
6
+ import json
7
+ import logging
8
+ from datetime import datetime
9
+ from pathlib import Path
10
+ from typing import Dict, Any
11
+ from dataclasses import asdict
12
+
13
+ from services.message_factory import AgentMessage
14
+
15
+ logger = logging.getLogger(__name__)
16
+
17
+
18
+ class SessionManager:
19
+ """Manages session state persistence and file operations"""
20
+
21
+ def __init__(self, session_id: str, conversation_id: str):
22
+ self.session_id = session_id
23
+ self.conversation_id = conversation_id
24
+ self.session_dir = Path(f"storyboard/{session_id}")
25
+ self.agents_dir = self.session_dir / "agents"
26
+
27
+ def save_session_state(
28
+ self,
29
+ message: AgentMessage,
30
+ request_data: Dict[str, Any],
31
+ memory_history: list,
32
+ iteration_count: int,
33
+ ):
34
+ """Save session state to disk following tech_specs.md structure"""
35
+ try:
36
+ # Create session directory structure
37
+ self.agents_dir.mkdir(parents=True, exist_ok=True)
38
+
39
+ # Save Brown's state
40
+ brown_state = {
41
+ "session_id": self.session_id,
42
+ "conversation_id": self.conversation_id,
43
+ "iteration_count": iteration_count,
44
+ "memory": memory_history,
45
+ "last_message": message.to_dict(),
46
+ "original_request": request_data,
47
+ "created_at": datetime.utcnow().isoformat() + "Z",
48
+ }
49
+
50
+ with open(self.agents_dir / "brown_state.json", "w") as f:
51
+ json.dump(brown_state, f, indent=2)
52
+
53
+ # Save conversation log
54
+ self._save_conversation_log(message)
55
+
56
+ logger.info(f"Saved session state to {self.session_dir}")
57
+
58
+ except Exception as e:
59
+ logger.error(f"Failed to save session state: {e}")
60
+
61
+ def _save_conversation_log(self, message: AgentMessage):
62
+ """Save or update conversation log"""
63
+ conversation_log = {
64
+ "conversation_id": self.conversation_id,
65
+ "session_id": self.session_id,
66
+ "messages": [message.to_dict()],
67
+ "created_at": datetime.utcnow().isoformat() + "Z",
68
+ "updated_at": datetime.utcnow().isoformat() + "Z",
69
+ }
70
+
71
+ log_file = self.agents_dir / "conversation_log.json"
72
+ if log_file.exists():
73
+ # Append to existing log
74
+ try:
75
+ with open(log_file, "r") as f:
76
+ existing_log = json.load(f)
77
+ existing_log["messages"].append(message.to_dict())
78
+ existing_log["updated_at"] = conversation_log["updated_at"]
79
+ conversation_log = existing_log
80
+ except Exception as e:
81
+ logger.warning(
82
+ f"Could not read existing log, creating new: {e}"
83
+ )
84
+
85
+ with open(log_file, "w") as f:
86
+ json.dump(conversation_log, f, indent=2)
87
+
88
+ def load_session_state(self) -> Dict[str, Any]:
89
+ """Load session state from disk"""
90
+ try:
91
+ state_file = self.agents_dir / "brown_state.json"
92
+ if state_file.exists():
93
+ with open(state_file, "r") as f:
94
+ return json.load(f)
95
+ except Exception as e:
96
+ logger.error(f"Failed to load session state: {e}")
97
+
98
+ return {}
99
+
100
+ def session_exists(self) -> bool:
101
+ """Check if session directory exists"""
102
+ return self.session_dir.exists()
103
+
104
+ def get_conversation_log(self) -> Dict[str, Any]:
105
+ """Get conversation log"""
106
+ try:
107
+ log_file = self.agents_dir / "conversation_log.json"
108
+ if log_file.exists():
109
+ with open(log_file, "r") as f:
110
+ return json.load(f)
111
+ except Exception as e:
112
+ logger.error(f"Failed to load conversation log: {e}")
113
+
114
+ return {"messages": []}
services/simple_evaluator.py ADDED
@@ -0,0 +1,134 @@
1
+ class SimpleEvaluator:
2
+ """Basic evaluation logic for Brown's decision making"""
3
+
4
+ MAX_ATTEMPTS = 3 # Original + 2 revisions
5
+
6
+ def __init__(self):
7
+ self.attempt_count = 0
8
+
9
+ def evaluate(self, bayko_output: dict, original_prompt: str) -> dict:
10
+ """Evaluate Bayko's output and decide: approve, reject, or refine"""
11
+ self.attempt_count += 1
12
+
13
+ print(
14
+ f"🔍 Brown evaluating attempt {self.attempt_count}/{self.MAX_ATTEMPTS}"
15
+ )
16
+
17
+ # Rule 1: Auto-reject if dialogue in images
18
+ if self._has_dialogue_in_images(bayko_output):
19
+ return {
20
+ "decision": "reject",
21
+ "reason": "Images contain dialogue text - use subtitles instead",
22
+ "final": True,
23
+ }
24
+
25
+ # Rule 2: Auto-reject if story is incoherent
26
+ if not self._is_story_coherent(bayko_output):
27
+ return {
28
+ "decision": "reject",
29
+ "reason": "Story panels don't follow logical sequence",
30
+ "final": True,
31
+ }
32
+
33
+ # Rule 3: Force approve if max attempts reached
34
+ if self.attempt_count >= self.MAX_ATTEMPTS:
35
+ return {
36
+ "decision": "approve",
37
+ "reason": f"Max attempts ({self.MAX_ATTEMPTS}) reached - accepting current quality",
38
+ "final": True,
39
+ }
40
+
41
+ # Rule 4: Check if output matches prompt intent
42
+ if self._matches_prompt_intent(bayko_output, original_prompt):
43
+ return {
44
+ "decision": "approve",
45
+ "reason": "Output matches prompt and quality is acceptable",
46
+ "final": True,
47
+ }
48
+ else:
49
+ return {
50
+ "decision": "refine",
51
+ "reason": "Output needs improvement to better match prompt",
52
+ "final": False,
53
+ }
54
+
55
+ def _has_dialogue_in_images(self, output: dict) -> bool:
56
+ """Check if panels mention dialogue in the image"""
57
+ panels = output.get("panels", [])
58
+
59
+ dialogue_keywords = [
60
+ "speech bubble",
61
+ "dialogue",
62
+ "talking",
63
+ "saying",
64
+ "text in image",
65
+ "speech",
66
+ "conversation",
67
+ ]
68
+
69
+ for panel in panels:
70
+ description = panel.get("description", "").lower()
71
+ if any(keyword in description for keyword in dialogue_keywords):
72
+ print(f"❌ Found dialogue in image: {description}")
73
+ return True
74
+
75
+ return False
76
+
77
+ def _is_story_coherent(self, output: dict) -> bool:
78
+ """Basic check for story coherence"""
79
+ panels = output.get("panels", [])
80
+
81
+ if len(panels) < 2:
82
+ return True # Single panel is always coherent
83
+
84
+ # Check 1: All panels should have descriptions
85
+ descriptions = [p.get("description", "") for p in panels]
86
+ if any(not desc.strip() for desc in descriptions):
87
+ print("❌ Some panels missing descriptions")
88
+ return False
89
+
90
+ # Check 2: Panels shouldn't be identical (no progression)
91
+ if len(set(descriptions)) == 1:
92
+ print("❌ All panels are identical - no story progression")
93
+ return False
94
+
95
+ # Check 3: Look for obvious incoherence keywords
96
+ incoherent_keywords = [
97
+ "unrelated",
98
+ "random",
99
+ "doesn't make sense",
100
+ "no connection",
101
+ "contradictory",
102
+ ]
103
+
104
+ full_text = " ".join(descriptions).lower()
105
+ if any(keyword in full_text for keyword in incoherent_keywords):
106
+ print("❌ Story contains incoherent elements")
107
+ return False
108
+
109
+ return True
110
+
111
+ def _matches_prompt_intent(self, output: dict, prompt: str) -> bool:
112
+ """Check if output generally matches the original prompt"""
113
+ panels = output.get("panels", [])
114
+
115
+ if not panels:
116
+ return False
117
+
118
+ # Simple keyword matching
119
+ prompt_words = set(prompt.lower().split())
120
+ panel_text = " ".join(
121
+ [p.get("description", "") for p in panels]
122
+ ).lower()
123
+ panel_words = set(panel_text.split())
124
+
125
+ # At least 20% of prompt words should appear in panel descriptions
126
+ overlap = len(prompt_words.intersection(panel_words))
127
+ match_ratio = overlap / len(prompt_words) if prompt_words else 0
128
+
129
+ print(f"📊 Prompt match ratio: {match_ratio:.2f}")
130
+ return match_ratio >= 0.2
131
+
132
+ def reset(self):
133
+ """Reset for new session"""
134
+ self.attempt_count = 0
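+ 
+ 
+ # Usage sketch (illustrative output):
+ #
+ #   evaluator = SimpleEvaluator()
+ #   verdict = evaluator.evaluate(
+ #       {"panels": [{"description": "An idol walks home in the rain"},
+ #                   {"description": "She finds a shivering puppy"}]},
+ #       "A moody idol finds a puppy",
+ #   )
+ #   # -> {"decision": "approve", ...}: no dialogue keywords, the panels progress,
+ #   #    and the keyword overlap with the prompt clears the 0.2 threshold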
services/style_tagger.py ADDED
@@ -0,0 +1,181 @@
1
+ import uuid
2
+ import logging
3
+ from typing import Dict, List, Optional, Any
4
+ from dataclasses import dataclass, asdict
5
+ from enum import Enum
6
+
7
+
8
+ @dataclass
9
+ class StyleAnalysis:
10
+ """Result of style analysis and tagging"""
11
+
12
+ detected_style: str
13
+ style_tags: List[str]
14
+ mood: str
15
+ color_palette: str
16
+ enhanced_prompt: str
17
+ confidence: float
18
+
19
+
20
+ class StyleTagger:
21
+ """Handles style detection and tagging"""
22
+
23
+ def __init__(self):
24
+ self.style_mappings = {
25
+ "studio_ghibli": {
26
+ "keywords": [
27
+ "ghibli",
28
+ "whimsical",
29
+ "nature",
30
+ "peaceful",
31
+ "magical",
32
+ ],
33
+ "tags": ["whimsical", "nature", "soft_lighting", "watercolor"],
34
+ "mood": "peaceful",
35
+ "color_palette": "warm_earth_tones",
36
+ },
37
+ "manga": {
38
+ "keywords": [
39
+ "manga",
40
+ "anime",
41
+ "dramatic",
42
+ "action",
43
+ "intense",
44
+ ],
45
+ "tags": [
46
+ "high_contrast",
47
+ "dramatic",
48
+ "speed_lines",
49
+ "screentones",
50
+ ],
51
+ "mood": "dynamic",
52
+ "color_palette": "black_white_accent",
53
+ },
54
+ "western": {
55
+ "keywords": ["superhero", "comic", "bold", "heroic", "action"],
56
+ "tags": [
57
+ "bold_lines",
58
+ "primary_colors",
59
+ "superhero",
60
+ "action",
61
+ ],
62
+ "mood": "heroic",
63
+ "color_palette": "bright_primary",
64
+ },
65
+ "whisper_soft": {
66
+ "keywords": ["soft", "gentle", "quiet", "subtle", "whisper"],
67
+ "tags": ["pastel", "dreamy", "soft_focus", "ethereal"],
68
+ "mood": "contemplative",
69
+ "color_palette": "muted_pastels",
70
+ },
71
+ }
72
+
73
+ self.mood_keywords = {
74
+ "peaceful": ["calm", "serene", "gentle", "quiet", "peaceful"],
75
+ "dramatic": [
76
+ "intense",
77
+ "conflict",
78
+ "tension",
79
+ "dramatic",
80
+ "powerful",
81
+ ],
82
+ "whimsical": [
83
+ "magical",
84
+ "wonder",
85
+ "fantasy",
86
+ "dream",
87
+ "whimsical",
88
+ ],
89
+ "melancholy": [
90
+ "sad",
91
+ "lonely",
92
+ "lost",
93
+ "melancholy",
94
+ "bittersweet",
95
+ ],
96
+ "energetic": [
97
+ "action",
98
+ "fast",
99
+ "dynamic",
100
+ "energetic",
101
+ "exciting",
102
+ ],
103
+ }
104
+
105
+ def analyze_style(
106
+ self, prompt: str, style_preference: Optional[str] = None
107
+ ) -> StyleAnalysis:
108
+ """
109
+ Analyze prompt and determine appropriate style tags
110
+
111
+ Args:
112
+ prompt: User's story prompt
113
+ style_preference: Optional user-specified style preference
114
+
115
+ Returns:
116
+ StyleAnalysis with detected style and tags
117
+ """
118
+ prompt_lower = prompt.lower()
119
+
120
+ # If user specified a style, use it if valid
121
+ if (
122
+ style_preference
123
+ and style_preference.lower() in self.style_mappings
124
+ ):
125
+ style_key = style_preference.lower()
126
+ confidence = 0.9 # High confidence for user-specified style
127
+ else:
128
+ # Auto-detect style based on keywords
129
+ style_scores = {}
130
+ for style, config in self.style_mappings.items():
131
+ score = sum(
132
+ 1
133
+ for keyword in config["keywords"]
134
+ if keyword in prompt_lower
135
+ )
136
+ if score > 0:
137
+ style_scores[style] = score
138
+
139
+ if style_scores:
140
+ style_key = max(
141
+ style_scores.keys(), key=lambda x: style_scores[x]
142
+ )
143
+ confidence = min(0.8, style_scores[style_key] * 0.2)
144
+ else:
145
+ # Default to studio_ghibli for unknown styles
146
+ style_key = "studio_ghibli"
147
+ confidence = 0.5
148
+
149
+ style_config = self.style_mappings[style_key]
150
+
151
+ # Detect mood
152
+ detected_moods = []
153
+ for mood, keywords in self.mood_keywords.items():
154
+ if any(keyword in prompt_lower for keyword in keywords):
155
+ detected_moods.append(mood)
156
+
157
+ # Use style's default mood if none detected
158
+ final_mood = (
159
+ detected_moods[0] if detected_moods else style_config["mood"]
160
+ )
161
+
162
+ # Enhance prompt with style information
163
+ enhanced_prompt = self._enhance_prompt(
164
+ prompt, style_config, final_mood
165
+ )
166
+
167
+ return StyleAnalysis(
168
+ detected_style=style_key,
169
+ style_tags=style_config["tags"],
170
+ mood=final_mood,
171
+ color_palette=style_config["color_palette"],
172
+ enhanced_prompt=enhanced_prompt,
173
+ confidence=confidence,
174
+ )
175
+
176
+ def _enhance_prompt(
177
+ self, original_prompt: str, style_config: Dict, mood: str
178
+ ) -> str:
179
+ """Enhance the original prompt with style-specific details"""
180
+ style_descriptors = ", ".join(style_config["tags"])
181
+ return f"{original_prompt}. Visual style: {style_descriptors}, mood: {mood}, color palette: {style_config['color_palette']}"
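+ 
+ 
+ # Usage sketch:
+ #
+ #   tagger = StyleTagger()
+ #   analysis = tagger.analyze_style("A magical, peaceful forest adventure")
+ #   # analysis.detected_style -> "studio_ghibli" ("magical"/"peaceful" keywords)
+ #   # analysis.enhanced_prompt appends the style tags, mood and color palette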
services/unified_memory.py ADDED
@@ -0,0 +1,111 @@
1
+ """
2
+ Unified Memory System
3
+ Combines active conversation memory with persistent SQLite storage
4
+ """
5
+
6
+ import sqlite3
7
+ from datetime import datetime
8
+ from typing import Any, Dict, List, Optional
9
+
10
+
11
+ class AgentMemory:
12
+ """Unified memory system with both active conversation and persistent storage"""
13
+
14
+ def __init__(
15
+ self, session_id: str, agent_name: str, db_path: str = "memory.db"
16
+ ):
17
+ self.session_id = session_id
18
+ self.agent_name = agent_name
19
+ self.db_path = db_path
20
+
21
+ # Active conversation memory (in-memory for fast access)
22
+ self.active_messages = []
23
+
24
+ # Initialize SQLite for persistence
25
+ self.conn = sqlite3.connect(db_path)
26
+ self._create_table()
27
+
28
+ def _create_table(self):
29
+ """Create SQLite table for persistent storage"""
30
+ with self.conn:
31
+ self.conn.execute(
32
+ """
33
+ CREATE TABLE IF NOT EXISTS memory (
34
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
35
+ session_id TEXT,
36
+ agent_name TEXT,
37
+ role TEXT,
38
+ content TEXT,
39
+ timestamp TEXT
40
+ )
41
+ """
42
+ )
43
+
44
+ def add_message(self, role: str, content: str):
45
+ """Add message to both active memory and persistent storage"""
46
+ timestamp = datetime.utcnow().isoformat()
47
+
48
+ # Add to active memory
49
+ message = {"role": role, "content": content, "timestamp": timestamp}
50
+ self.active_messages.append(message)
51
+
52
+ # Add to persistent storage
53
+ with self.conn:
54
+ self.conn.execute(
55
+ """
56
+ INSERT INTO memory (session_id, agent_name, role, content, timestamp)
57
+ VALUES (?, ?, ?, ?, ?)
58
+ """,
59
+ (self.session_id, self.agent_name, role, content, timestamp),
60
+ )
61
+
62
+ def get_history(self) -> List[Dict[str, Any]]:
63
+ """Get active conversation history"""
64
+ return self.active_messages.copy()
65
+
66
+ def get_session_history(
67
+         self, session_id: Optional[str] = None
68
+ ) -> List[Dict[str, Any]]:
69
+ """Get persistent session history from SQLite"""
70
+ if session_id is None:
71
+ session_id = self.session_id
72
+
73
+ with self.conn:
74
+ cur = self.conn.execute(
75
+ """
76
+ SELECT role, content, timestamp
77
+ FROM memory
78
+ WHERE session_id = ?
79
+ ORDER BY timestamp
80
+ """,
81
+ (session_id,),
82
+ )
83
+ rows = cur.fetchall()
84
+
85
+ return [
86
+ {"role": row[0], "content": row[1], "timestamp": row[2]}
87
+ for row in rows
88
+ ]
89
+
90
+ def clear(self):
91
+ """Clear active memory (keeps persistent storage)"""
92
+ self.active_messages.clear()
93
+
94
+     def clear_session(self, session_id: Optional[str] = None):
95
+ """Clear persistent storage for a session"""
96
+ if session_id is None:
97
+ session_id = self.session_id
98
+
99
+ with self.conn:
100
+ self.conn.execute(
101
+ "DELETE FROM memory WHERE session_id = ?", (session_id,)
102
+ )
103
+
104
+ def get_memory_size(self) -> int:
105
+ """Get size of active memory"""
106
+ return len(self.active_messages)
107
+
108
+ def close(self):
109
+ """Close database connection"""
110
+ if self.conn:
111
+ self.conn.close()
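+ 
+ 
+ # Usage sketch (writes to memory.db in the working directory):
+ #
+ #   memory = AgentMemory(session_id="session_001", agent_name="agent_brown")
+ #   memory.add_message("user", "A moody K-pop idol finds a puppy")
+ #   memory.get_history()          # in-memory list for the current run
+ #   memory.get_session_history()  # same messages, read back from SQLite
+ #   memory.close()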
tech_specs.md ADDED
@@ -0,0 +1,307 @@
1
+ # Technical Specs
2
+
3
+ ## 🔄 Multi-Turn Agent Communication
4
+
5
+ ### Feedback Loop Implementation
6
+
7
+ 1. **Initial Request**: Brown sends structured prompt to Bayko
8
+ 2. **Content Generation**: Bayko processes via Modal + sponsor APIs
9
+ 3. **Quality Validation**: Brown evaluates output against original intent
10
+ 4. **Iterative Refinement**: Up to 3 feedback cycles with specific improvement requests
11
+ 5. **Final Assembly**: Brown compiles approved content into deliverable format (see the sketch below)
12
+
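+ A minimal control-flow sketch of the loop above (agent method names are
+ illustrative, not the actual APIs):
+ 
+ ```python
+ MAX_ITERATIONS = 3
+ 
+ def run_feedback_loop(user_prompt: str):
+     request = brown.validate_and_enhance(user_prompt)       # step 1
+     for iteration in range(1, MAX_ITERATIONS + 1):
+         content = bayko.generate(request)                   # step 2
+         verdict = brown.evaluate(content, user_prompt)      # step 3
+         if verdict.decision == "approve":
+             break
+         request = brown.build_refinement(content, verdict)  # step 4
+     return brown.assemble(content)                          # step 5
+ ```
+ 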
13
+ ### Agent Message Schema
14
+
15
+ ```json
16
+ {
17
+ "message_id": "msg_001",
18
+ "timestamp": "2025-01-15T10:30:00Z",
19
+ "sender": "agent_brown",
20
+ "recipient": "agent_bayko",
21
+ "message_type": "generation_request",
22
+ "payload": {
23
+ "prompt": "A moody K-pop idol finds a puppy",
24
+ "style_tags": ["studio_ghibli", "whisper_soft_lighting"],
25
+ "panels": 4,
26
+ "language": "korean",
27
+ "extras": ["narration", "subtitles"]
28
+ },
29
+ "context": {
30
+ "conversation_id": "conv_001",
31
+ "iteration": 1,
32
+ "previous_feedback": null
33
+ }
34
+ }
35
+ ```
36
+
37
+ ---
38
+
39
+ ## 📁 File Organization & Data Standards
40
+
41
+ ### Output Directory Structure
42
+
43
+ ```text
44
+ /storyboard/
45
+ ├── session_001/
46
+ │ ├── agents/
47
+ │ │ ├── brown_state.json # Agent Brown memory/state
48
+ │ │ ├── bayko_state.json # Agent Bayko memory/state
49
+ │ │ └── conversation_log.json # Inter-agent messages
50
+ │ ├── content/
51
+ │ │ ├── panel_1.png # Generated images
52
+ │ │ ├── panel_1_audio.mp3 # TTS narration
53
+ │ │ ├── panel_1_subs.vtt # Subtitle files
54
+ │ │ └── metadata.json # Content metadata
55
+ │ ├── iterations/
56
+ │ │ ├── v1_feedback.json # Validation feedback
57
+ │ │ ├── v2_refinement.json # Refinement requests
58
+ │ │ └── final_approval.json # Final validation
59
+ │ └── output/
60
+ │ ├── final_comic.png # Assembled comic
61
+ │ ├── manifest.json # Complete session data
62
+ │ └── performance_log.json # Timing/cost metrics
63
+ ```
64
+
65
+ ### Metadata Standards
66
+
67
+ ```json
68
+ {
69
+ "session_id": "session_001",
70
+ "created_at": "2025-01-15T10:30:00Z",
71
+ "user_prompt": "Original user input",
72
+ "processing_stats": {
73
+ "total_iterations": 2,
74
+ "processing_time_ms": 45000,
75
+ "api_calls": {
76
+ "openai": 3,
77
+ "mistral": 2,
78
+ "modal": 8
79
+ },
80
+ "cost_breakdown": {
81
+ "compute": "$0.15",
82
+ "api_calls": "$0.08"
83
+ }
84
+ },
85
+ "quality_metrics": {
86
+ "brown_approval_score": 0.92,
87
+ "style_consistency": 0.88,
88
+ "prompt_adherence": 0.95
89
+ }
90
+ }
91
+ ```
92
+
93
+ ---
94
+
95
+ ## ⚙️ Tool Orchestration & API Integration
96
+
97
+ ### Modal Compute Layer
98
+
99
+ ```python
100
+ # Modal function for SDXL image generation
101
+ @app.function(
102
+     image=modal.Image.debian_slim().pip_install("diffusers", "transformers", "accelerate", "safetensors", "Pillow", "torch"),
+     gpu="A10G",
+     timeout=300,
+ )
+ def generate_comic_panel(prompt: str, style: str) -> bytes:
+     # SDXL Turbo pipeline (mirrors tools/image_generator.py)
+     import io
+     import torch
+     from diffusers import AutoPipelineForText2Image
+
+     pipe = AutoPipelineForText2Image.from_pretrained(
+         "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
+     ).to("cuda")
+     image = pipe(f"{prompt}, {style}", num_inference_steps=1, guidance_scale=0.0).images[0]
+     buf = io.BytesIO()
+     image.save(buf, format="PNG")
+     return buf.getvalue()
109
+ ```
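+
+ Invocation from the orchestrator is then a single remote call (sketch; the prompt and style values are illustrative):
+
+ ```python
+ png_bytes = generate_comic_panel.remote("a puppy in the rain", "studio_ghibli")
+ ```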
110
+
111
+ ### Sponsor API Integration
112
+
113
+ - **OpenAI GPT-4**: Dialogue generation and character voice consistency
114
+ - **Mistral**: Style adaptation and tone refinement
115
+ - **HuggingFace**: SDXL model hosting and inference
116
+ - **Modal**: Serverless GPU compute for image/audio generation
117
+
118
+ > **Mistral Agents**: We investigated the experimental `client.beta.agents` framework for dynamic task routing but deferred adoption because it was not yet stable at build time.
119
+
120
+ ### LlamaIndex Agent Memory
121
+
122
+ ```python
123
+ from llama_index.core.agent import ReActAgent
124
+ from llama_index.core.memory import ChatMemoryBuffer
125
+
126
+ # Agent Brown with persistent memory
127
+ brown_agent = ReActAgent.from_tools(
128
+ tools=[validation_tool, feedback_tool, assembly_tool],
129
+ memory=ChatMemoryBuffer.from_defaults(token_limit=4000),
130
+ verbose=True
131
+ )
132
+ ```
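+
+ A single chat call then drives the Thought → Action → Observation loop (sketch; assumes `OPENAI_API_KEY` is set and the three tools are defined):
+
+ ```python
+ response = brown_agent.chat("Validate this storyboard request: a cat finds a portal.")
+ print(response)
+ ```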
133
+
134
+ ---
135
+
136
+ ## 🌐 Gradio-FastAPI Integration
137
+
138
+ ### Frontend Architecture
139
+
140
+ ```python
141
+ import gradio as gr
142
+ from fastapi import FastAPI
143
+ import asyncio
144
+
145
+ app = FastAPI()
146
+
147
+ # Gradio interface with real-time updates
148
+ def create_comic_interface():
149
+ with gr.Blocks(theme=gr.themes.Soft()) as demo:
150
+ # Input components
151
+ prompt_input = gr.Textbox(label="Story Prompt")
152
+ style_dropdown = gr.Dropdown(["Studio Ghibli", "Manga", "Western"])
153
+
154
+ # Real-time status display
155
+ status_display = gr.Markdown("Ready to generate...")
156
+         # Progress reporting attaches via a gr.Progress() argument on the event handler
157
+
158
+ # Agent thinking display
159
+ agent_logs = gr.JSON(label="Agent Decision Log", visible=True)
160
+
161
+ # Output gallery
162
+ comic_output = gr.Gallery(label="Generated Comic Panels")
163
+
164
+ # WebSocket connection for real-time updates
165
+ demo.load(setup_websocket_connection)
166
+
167
+ return demo
168
+ ```
169
+
170
+ ### Real-Time Agent Status Updates
171
+
172
+ - **Agent Thinking Display**: Live JSON feed of agent decision-making
173
+ - **Progress Tracking**: Visual progress bar with stage indicators
174
+ - **Error Handling**: Graceful failure recovery with user feedback
175
+ - **Performance Metrics**: Real-time cost and timing information
176
+
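+ One way to surface these updates is a generator handler, since Gradio streams values yielded by an event handler to the UI (sketch; the agent calls are placeholders):
+
+ ```python
+ def generate_with_status(prompt):
+     yield "🧠 Brown validating prompt...", None
+     # ... run Brown/Bayko here ...
+     yield "🎨 Bayko rendering panels...", None
+     yield "✅ Done", ["panel_1.png"]
+ ```
+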
177
+ ---
178
+
179
+ ## 🚀 Deployment Configuration
180
+
181
+ ### HuggingFace Spaces Frontend
182
+
183
+ ```yaml
184
+ # spaces_config.yml
185
+ title: Agentic Comic Generator
186
+ emoji: 🎨
187
+ colorFrom: blue
188
+ colorTo: purple
189
+ sdk: gradio
190
+ sdk_version: '4.0.0'
191
+ app_file: app.py
192
+ pinned: false
193
+ license: mit
194
+ ```
195
+
196
+ ### Modal Backend Services
197
+
198
+ ```python
199
+ # modal_app.py
200
+ import modal
201
+
202
+ app = modal.App("agentic-comic-generator")
203
+
204
+ # Shared volume for agent state persistence
205
+ volume = modal.Volume.from_name("comic-generator-storage")
206
+
207
+ @app.function(
208
+ image=modal.Image.debian_slim().pip_install_from_requirements("requirements.txt"),
209
+ volumes={"/storage": volume},
210
+ keep_warm=1
211
+ )
212
+ def agent_orchestrator():
213
+ # Main agent coordination logic
214
+ pass
215
+ ```
216
+
217
+ ### Environment Configuration
218
+
219
+ ```python
220
+ # config.py
221
+ import os
222
+ from pydantic import BaseSettings
223
+
224
+ class Settings(BaseSettings):
225
+ # Sponsor API Keys
226
+ openai_api_key: str = os.getenv("OPENAI_API_KEY")
227
+ mistral_api_key: str = os.getenv("MISTRAL_API_KEY")
228
+ hf_token: str = os.getenv("HF_TOKEN")
229
+
230
+ # Modal configuration
231
+ modal_token_id: str = os.getenv("MODAL_TOKEN_ID")
232
+ modal_token_secret: str = os.getenv("MODAL_TOKEN_SECRET")
233
+
234
+ # Application settings
235
+ max_iterations: int = 3
236
+ timeout_seconds: int = 300
237
+ debug_mode: bool = False
238
+ ```
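+
+ Settings are instantiated once at startup (sketch); `BaseSettings` also reads matching environment variables directly, so the explicit `os.getenv` defaults are belt-and-suspenders:
+
+ ```python
+ settings = Settings()
+ assert settings.max_iterations == 3
+ ```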
239
+
240
+ ---
241
+
242
+ ## 🔧 Extensibility Framework
243
+
244
+ ### Plugin Architecture
245
+
246
+ ```python
247
+ # plugins/base.py
248
+ from abc import ABC, abstractmethod
249
+
250
+ class ContentPlugin(ABC):
251
+ @abstractmethod
252
+ async def generate(self, prompt: str, context: dict) -> dict:
253
+ pass
254
+
255
+ @abstractmethod
256
+ def validate(self, content: dict) -> bool:
257
+ pass
258
+
259
+ # plugins/tts_plugin.py
260
+ class TTSPlugin(ContentPlugin):
261
+     async def generate(self, prompt: str, context: dict) -> dict:
+         # TTS implementation using sponsor APIs
+         ...
+
+     def validate(self, content: dict) -> bool:
+         # A concrete plugin must also implement the abstract validate()
+         return bool(content.get("audio"))
264
+ ```
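+
+ A minimal sketch of how such plugins might be wired in; the registry and `run_plugin` helper are illustrative, not part of the codebase:
+
+ ```python
+ PLUGINS: dict[str, ContentPlugin] = {"tts": TTSPlugin()}
+
+ async def run_plugin(kind: str, prompt: str, context: dict) -> dict:
+     plugin = PLUGINS[kind]
+     content = await plugin.generate(prompt, context)
+     if not plugin.validate(content):
+         raise ValueError(f"{kind} plugin produced invalid content")
+     return content
+ ```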
265
+
266
+ ### Agent Extension Points
267
+
268
+ - **Custom Tools**: Easy integration of new AI services
269
+ - **Memory Backends**: Swappable persistence layers (Redis, PostgreSQL)
270
+ - **Validation Rules**: Configurable content quality checks
271
+ - **Output Formats**: Support for video, interactive comics, AR content
272
+
273
+ ### API Abstraction Layer
274
+
275
+ ```python
276
+ # services/ai_service.py
277
+ class AIServiceRouter:
278
+ def __init__(self):
279
+ self.providers = {
280
+ "dialogue": OpenAIService(),
281
+ "style": MistralService(),
282
+ "image": HuggingFaceService(),
283
+ "compute": ModalService()
284
+ }
285
+
286
+ async def route_request(self, service_type: str, payload: dict):
287
+ return await self.providers[service_type].process(payload)
288
+ ```
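+
+ Usage is then uniform across providers (sketch; assumes each service class exposes an async `process()` method):
+
+ ```python
+ # inside an async function
+ router = AIServiceRouter()
+ dialogue = await router.route_request("dialogue", {"prompt": "Panel 1 opening line"})
+ ```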
289
+
290
+ ---
291
+
292
+ ## 📊 Performance & Monitoring
293
+
294
+ ### Metrics Collection
295
+
296
+ - **Agent Performance**: Decision time, iteration counts, success rates
297
+ - **API Usage**: Cost tracking, rate limiting, error rates
298
+ - **User Experience**: Generation time, satisfaction scores
299
+ - **System Health**: Resource utilization, error logs
300
+
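+ A lightweight way to collect the timing side of these metrics (sketch; the dict would be flushed to `performance_log.json` per the session layout):
+
+ ```python
+ import time
+ from contextlib import contextmanager
+
+ @contextmanager
+ def track(metrics: dict, name: str):
+     start = time.perf_counter()
+     try:
+         yield
+     finally:
+         metrics[name] = round((time.perf_counter() - start) * 1000, 1)  # ms
+
+ metrics: dict = {}
+ with track(metrics, "panel_generation_ms"):
+     ...  # e.g. generate_comic_panel.remote(...)
+ ```
+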
301
+ #### Cost Optimization
302
+
303
+ - **Smart Caching**: Reuse similar generations across sessions
304
+ - **Batch Processing**: Group API calls for efficiency
305
+ - **Fallback Strategies**: Graceful degradation when services are unavailable
306
+
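+ A minimal sketch of prompt-keyed caching (hypothetical; a production version would persist across sessions, e.g. on the shared Modal volume):
+
+ ```python
+ import hashlib
+
+ _cache: dict[str, bytes] = {}
+
+ def cached_panel(prompt: str, style: str, generate) -> bytes:
+     key = hashlib.sha256(f"{prompt}|{style}".encode()).hexdigest()
+     if key not in _cache:
+         _cache[key] = generate(prompt, style)  # e.g. generate_comic_panel.remote
+     return _cache[key]
+ ```
+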
307
+ ---
tests/__init__.py ADDED
@@ -0,0 +1 @@
1
+ """Tests for Bayko agents"""
tests/conftest.py ADDED
@@ -0,0 +1,9 @@
1
+ """Configure pytest to add project root to path"""
2
+
3
+ import os
4
+ import sys
5
+ from pathlib import Path
6
+
7
+ # Add project root to Python path
8
+ project_root = Path(__file__).parent.parent
9
+ sys.path.insert(0, str(project_root))
tests/test_bayko_llm_integration.py ADDED
@@ -0,0 +1,159 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Comprehensive test for Agent Bayko LLM integration
4
+ Tests both LLM and fallback modes with proper session management
5
+ """
6
+
7
+ import os
8
+ import asyncio
9
+ import json
10
+ from pathlib import Path
11
+ from agents.bayko import create_agent_bayko
12
+
13
+ # Load environment variables
14
+ try:
15
+ from dotenv import load_dotenv
16
+
17
+ load_dotenv()
18
+ except ImportError:
19
+ pass
20
+
21
+ try:
22
+ from openai import OpenAI
23
+ except ImportError:
24
+ OpenAI = None
25
+
26
+
27
+ async def test_bayko_llm_integration():
28
+ """Comprehensive test of Bayko LLM integration"""
29
+
30
+ print("🧪 Testing Agent Bayko LLM Integration")
31
+ print("=" * 60)
32
+
33
+ # Test 1: Bayko without LLM (fallback mode)
34
+ print("\n1️⃣ Testing Bayko without LLM (fallback mode)")
35
+ print("-" * 40)
36
+ bayko_fallback = create_agent_bayko(llm=None)
37
+
38
+ # Test prompt generation fallback
39
+ description = "A K-pop idol walking alone in the rain, feeling melancholy"
40
+ style_tags = ["whimsical", "soft_lighting", "watercolor", "studio_ghibli"]
41
+ mood = "melancholy"
42
+
43
+ print(f"📝 Original description: {description}")
44
+ print(f"🎨 Style tags: {style_tags}")
45
+ print(f"😔 Mood: {mood}")
46
+
47
+ result_fallback = bayko_fallback.generate_prompt_from_description(
48
+ description, style_tags, mood
49
+ )
50
+ print(f"🔄 Fallback result: {result_fallback}")
51
+
52
+ # Test feedback revision fallback
53
+ feedback = {
54
+ "improvement_suggestions": [
55
+ "Improve visual style consistency",
56
+ "Enhance emotional depth",
57
+ ],
58
+ "overall_score": 0.6,
59
+ "reason": "Needs better style consistency and emotional impact",
60
+ }
61
+ focus_areas = ["style_consistency", "narrative_flow"]
62
+
63
+ revised_fallback = bayko_fallback.revise_panel_description(
64
+ description, feedback, focus_areas
65
+ )
66
+ print(f"🔄 Fallback revision: {revised_fallback}")
67
+
68
+ # Test 2: Bayko with LLM (if available)
69
+ if OpenAI and os.getenv("OPENAI_API_KEY"):
70
+ print("\n2️⃣ Testing Bayko with LLM")
71
+ print("-" * 40)
72
+ try:
73
+ llm = OpenAI()
74
+ bayko_llm = create_agent_bayko(llm=llm)
75
+
76
+ print(f"📝 Original description: {description}")
77
+ print(f"🎨 Style tags: {style_tags}")
78
+ print(f"😔 Mood: {mood}")
79
+
80
+ # Test LLM prompt generation
81
+ print("\n🤖 Testing LLM prompt generation...")
82
+ result_llm = bayko_llm.generate_prompt_from_description(
83
+ description, style_tags, mood
84
+ )
85
+ print(f"✨ LLM enhanced prompt: {result_llm}")
86
+
87
+ # Test LLM feedback revision
88
+ print("\n🤖 Testing LLM feedback revision...")
89
+ print(f"📋 Feedback: {feedback}")
90
+ print(f"🎯 Focus areas: {focus_areas}")
91
+
92
+ revised_llm = bayko_llm.revise_panel_description(
93
+ description, feedback, focus_areas
94
+ )
95
+ print(f"✨ LLM revised description: {revised_llm}")
96
+
97
+ # Test session management and memory
98
+ print("\n🧠 Testing session management...")
99
+ session_info = bayko_llm.get_session_info()
100
+ print(f"📊 Session info: {session_info}")
101
+
102
+ # Check if LLM data was saved
103
+ if (
104
+ hasattr(bayko_llm, "current_session")
105
+ and bayko_llm.current_session
106
+ ):
107
+ session_dir = Path(
108
+ f"storyboard/{bayko_llm.current_session}/llm_data"
109
+ )
110
+ if session_dir.exists():
111
+ llm_files = list(session_dir.glob("*.json"))
112
+ print(f"💾 LLM data files saved: {len(llm_files)}")
113
+ for file in llm_files:
114
+ print(f" 📄 {file.name}")
115
+ # Show content of first file
116
+ if file.name.startswith("generation_"):
117
+ with open(file, "r") as f:
118
+ data = json.load(f)
119
+ print(
120
+ f" 📋 Content: {data.get('generated_prompt', 'N/A')[:100]}..."
121
+ )
122
+ else:
123
+ print("⚠️ No LLM data directory found")
124
+
125
+ except Exception as e:
126
+ print(f"❌ LLM test failed: {e}")
127
+ print(f" Error type: {type(e).__name__}")
128
+ import traceback
129
+
130
+ traceback.print_exc()
131
+ else:
132
+ print("\n2️⃣ Skipping LLM test")
133
+ print("-" * 40)
134
+ if not OpenAI:
135
+ print("⚠️ OpenAI package not available")
136
+ elif not os.getenv("OPENAI_API_KEY"):
137
+ print("⚠️ OPENAI_API_KEY not found in environment")
138
+ else:
139
+ print("⚠️ Unknown issue with OpenAI setup")
140
+
141
+ # Test 3: Compare outputs
142
+ print("\n3️⃣ Comparison Summary")
143
+ print("-" * 40)
144
+ print(f"📏 Fallback prompt length: {len(result_fallback)} chars")
145
+ print(f"📏 Fallback revision length: {len(revised_fallback)} chars")
146
+
147
+ if "result_llm" in locals():
148
+ print(f"📏 LLM prompt length: {len(result_llm)} chars")
149
+ print(f"📏 LLM revision length: {len(revised_llm)} chars")
150
+ print(
151
+ f"🔍 LLM enhancement factor: {len(result_llm) / len(result_fallback):.2f}x"
152
+ )
153
+
154
+ print("\n✅ Test completed successfully!")
155
+ print("=" * 60)
156
+
157
+
158
+ if __name__ == "__main__":
159
+ asyncio.run(test_bayko_llm_integration())
tests/test_bayko_workflow_init.py ADDED
@@ -0,0 +1,151 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Test suite for BaykoWorkflow initialization and session handling
4
+ """
5
+
6
+ import os
7
+ import pytest
8
+ import asyncio
9
+ from datetime import datetime
10
+ from pathlib import Path
11
+ from typing import Optional
12
+ from services.session_id_generator import SessionIdGenerator
13
+
14
+ # Add pytest-asyncio marker
15
+ pytestmark = pytest.mark.asyncio
16
+
17
+ from agents.bayko_workflow import create_agent_bayko, BaykoWorkflow
18
+
19
+ # Load environment variables
20
+ try:
21
+ from dotenv import load_dotenv
22
+ load_dotenv()
23
+ except ImportError:
24
+ pass
25
+
26
+
27
+ def create_test_workflow(api_key: Optional[str] = None) -> BaykoWorkflow:
28
+ """Helper to create workflow with specified API key"""
29
+ return create_agent_bayko(openai_api_key=api_key)
30
+
31
+
32
+ async def test_workflow_initialization():
33
+ """Test core workflow initialization"""
34
+ print("\n🧪 Testing BaykoWorkflow Initialization")
35
+ print("=" * 60)
36
+
37
+ # Test 1: With valid API key
38
+ if os.getenv("OPENAI_API_KEY"):
39
+ print("\n1️⃣ Testing initialization with API key")
40
+ workflow = create_test_workflow(os.getenv("OPENAI_API_KEY"))
41
+
42
+ assert workflow.llm is not None, "LLM should be initialized"
43
+ assert workflow.agent is not None, "ReActAgent should be initialized"
44
+ assert (
45
+ workflow.bayko_agent is not None
46
+ ), "Bayko agent should be initialized"
47
+ assert workflow.tools is not None, "Tools should be initialized"
48
+ print("✅ Workflow initialized with LLM capabilities")
49
+
50
+ # Test 2: Without API key
51
+ print("\n2️⃣ Testing initialization without API key")
52
+ workflow_no_llm = create_test_workflow(None)
53
+
54
+ assert workflow_no_llm.llm is None, "LLM should not be initialized"
55
+ assert (
56
+ workflow_no_llm.agent is None
57
+ ), "ReActAgent should not be initialized"
58
+ assert (
59
+ workflow_no_llm.bayko_agent is not None
60
+ ), "Bayko agent should still be initialized"
61
+ assert (
62
+ workflow_no_llm.tools is not None
63
+ ), "Tools should still be initialized"
64
+ print("✅ Workflow initialized in fallback mode")
65
+
66
+
67
+ async def test_session_initialization():
68
+ """Test session initialization and handling"""
69
+ print("\n🧪 Testing Session Initialization")
70
+ print("=" * 60)
71
+
72
+ workflow = create_test_workflow(os.getenv("OPENAI_API_KEY")) # Test 1: Basic session initialization
73
+ print("\n1️⃣ Testing basic session initialization")
74
+ session_id = SessionIdGenerator.create_session_id("test")
75
+ workflow.initialize_session(session_id)
76
+
77
+ assert (
78
+ workflow.session_manager is not None
79
+ ), "Session manager should be initialized"
80
+ assert workflow.memory is not None, "Memory should be initialized"
81
+ assert (
82
+ workflow.message_factory is not None
83
+ ), "Message factory should be initialized"
84
+ print("✅ Session services initialized")
85
+
86
+     # Test 2: Session with custom conversation ID
+     print("\n2️⃣ Testing session with custom conversation ID")
87
+ session_id = SessionIdGenerator.create_session_id("test")
88
+ conv_id = "custom_conv_001"
89
+ workflow.initialize_session(session_id, conv_id)
90
+
91
+ assert (
92
+ workflow.session_manager is not None
93
+ ), "Session manager should be initialized"
94
+ assert workflow.memory is not None, "Memory should be initialized"
95
+ print("✅ Session initialized with custom conversation ID")
96
+
97
+
98
+ async def test_generation_request():
99
+ """Test generation request handling""" print("\n🧪 Testing Generation Request Processing")
100
+ print("=" * 60)
101
+ workflow = create_test_workflow(os.getenv("OPENAI_API_KEY"))
102
+ session_id = SessionIdGenerator.create_session_id("test")
103
+ workflow.initialize_session(session_id)
104
+
105
+ # Create test request
106
+ test_request = {
107
+ "prompt": "A curious robot exploring a garden for the first time",
108
+ "original_prompt": "Robot discovers nature",
109
+ "style_tags": ["whimsical", "soft_lighting", "watercolor"],
110
+ "panels": 2,
111
+ "session_id": session_id,
112
+ }
113
+
114
+ # Test 1: Process with LLM if available
115
+ print("\n1️⃣ Testing generation request processing")
116
+ result = workflow.process_generation_request(test_request)
117
+ assert result is not None, "Should get a result"
118
+ print("✅ Successfully processed generation request")
119
+
120
+ # Test 2: Verify fallback works
121
+ print("\n2️⃣ Testing fallback generation")
122
+ workflow_no_llm = create_test_workflow(None)
123
+ workflow_no_llm.initialize_session(session_id)
124
+
125
+ fallback_result = workflow_no_llm.process_generation_request(test_request)
126
+ assert fallback_result is not None, "Should get a fallback result"
127
+ assert (
128
+ "fallback" in fallback_result.lower()
129
+ ), "Should indicate fallback mode"
130
+ print("✅ Fallback generation works")
131
+
132
+
133
+ async def main():
134
+ """Run all tests"""
135
+ print("🚀 Starting BaykoWorkflow Tests")
136
+ print("=" * 80)
137
+
138
+ try:
139
+ await test_workflow_initialization()
140
+ await test_session_initialization()
141
+ await test_generation_request()
142
+
143
+ print("\n✨ All tests completed successfully!")
144
+ except AssertionError as e:
145
+ print(f"\n❌ Test failed: {str(e)}")
146
+ except Exception as e:
147
+ print(f"\n❌ Unexpected error: {str(e)}")
148
+
149
+
150
+ if __name__ == "__main__":
151
+ asyncio.run(main())
tests/test_brown_multimodal.py ADDED
@@ -0,0 +1,60 @@
1
+ """
2
+ Demo: Agent Brown with LlamaIndex ReActAgent
3
+ Shows how the original AgentBrown is used with LlamaIndex tools for hackathon
4
+ """
5
+
6
+ import os
7
+ from dotenv import load_dotenv, find_dotenv
8
+ from agents.brown_workflow import create_brown_workflow
9
+
10
+ load_dotenv(find_dotenv())
11
+
12
+ print("DEBUG: loading .env...")
13
+ load_dotenv()
14
+ print("DEBUG: key =", os.getenv("OPENAI_API_KEY"))
15
+
16
+
17
+ def main():
18
+ """Demo the multimodal Brown agent using your original AgentBrown class"""
19
+
20
+ print("🤖 Agent Brown + LlamaIndex ReActAgent Demo")
21
+ print("=" * 60)
22
+ print("✅ Using your original AgentBrown class with LlamaIndex tools")
23
+ print("🏆 Hackathon demo showcasing ReAct reasoning")
24
+ print("=" * 60)
25
+
26
+ # Check API key
27
+ if not os.getenv("OPENAI_API_KEY"):
28
+ print("❌ Please set OPENAI_API_KEY environment variable")
29
+ print("Example: export OPENAI_API_KEY='your-key-here'")
30
+ return
31
+
32
+ # Create workflow (uses your original AgentBrown internally)
33
+ print("🔧 Creating Brown workflow with LlamaIndex ReActAgent...")
34
+ workflow = create_brown_workflow(max_iterations=3)
35
+ print("✅ Ready!")
36
+
37
+ # Demo prompt
38
+ prompt = """A moody K-pop idol finds a puppy on the street. It changes everything.
39
+ Use Studio Ghibli style with soft colors and 4 panels."""
40
+
41
+ print(f"\n📝 Demo Prompt:")
42
+ print(f"'{prompt}'")
43
+ print("\n🔄 Processing with Agent Brown ReActAgent...")
44
+ print("(Watch for Thought → Action → Observation pattern)")
45
+ print("=" * 60)
46
+
47
+ # Process with visible reasoning
48
+ result = workflow.process_comic_request(prompt)
49
+
50
+ print("\n" + "=" * 60)
51
+ print("🎉 Final Result:")
52
+ print(result)
53
+ print("=" * 60)
54
+ print(
55
+ "✅ Demo complete! Your original AgentBrown methods were used as tools."
56
+ )
57
+
58
+
59
+ if __name__ == "__main__":
60
+ main()
tests/test_complete_integration.py ADDED
@@ -0,0 +1,199 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Complete integration test for Agent Bayko LlamaIndex workflow
4
+ Tests ReActAgent, FunctionTools, Memory, and LLM integration
5
+ """
6
+
7
+ import os
8
+ import asyncio
9
+ import json
10
+ from pathlib import Path
11
+
12
+ # Load environment variables
13
+ try:
14
+ from dotenv import load_dotenv
15
+
16
+ load_dotenv()
17
+ except ImportError:
18
+ pass
19
+
20
+ from agents.bayko_workflow import create_agent_bayko
21
+
22
+
23
+ async def test_complete_integration():
24
+ """Test complete Bayko workflow integration"""
25
+
26
+ print("🏆 Testing Complete Agent Bayko LlamaIndex Integration")
27
+ print("=" * 70)
28
+ print("🎯 Demonstrating: ReActAgent + FunctionTools + Memory + LLM")
29
+ print("=" * 70)
30
+
31
+ # Test 1: Create workflow with LLM
32
+ if os.getenv("OPENAI_API_KEY"):
33
+ print("\n1️⃣ Creating Bayko Workflow with LlamaIndex ReActAgent")
34
+ print("-" * 50)
35
+
36
+ try:
37
+ # Create workflow
38
+ bayko_workflow = create_agent_bayko(
39
+ openai_api_key=os.getenv("OPENAI_API_KEY")
40
+ )
41
+
42
+ # Initialize session
43
+ session_id = "hackathon_demo_001"
44
+ bayko_workflow.initialize_session(session_id)
45
+
46
+ print(
47
+ f"✅ Workflow created with LLM: {bayko_workflow.llm is not None}"
48
+ )
49
+ print(f"✅ ReActAgent created: {bayko_workflow.agent is not None}")
50
+ print(f"✅ Tools available: {len(bayko_workflow.tools)}")
51
+ print(f"✅ Session initialized: {session_id}")
52
+
53
+ # List available tools
54
+ print("\n🛠️ Available LlamaIndex FunctionTools:")
55
+ for i, tool in enumerate(bayko_workflow.tools, 1):
56
+ print(
57
+ f" {i}. {tool.metadata.name}: {tool.metadata.description[:60]}..."
58
+ )
59
+
60
+ # Test 2: Test individual tools
61
+ print("\n2️⃣ Testing Individual LlamaIndex FunctionTools")
62
+ print("-" * 50)
63
+
64
+ # Test enhanced prompt generation
65
+ print("🤖 Testing generate_enhanced_prompt tool...")
66
+ prompt_result = (
67
+ bayko_workflow.bayko_tools.generate_enhanced_prompt_tool(
68
+ description="A melancholic K-pop idol walking in rain",
69
+ style_tags='["whimsical", "soft_lighting", "watercolor"]',
70
+ mood="melancholy",
71
+ )
72
+ )
73
+ prompt_data = json.loads(prompt_result)
74
+ print(
75
+ f"✨ Enhanced prompt: {prompt_data['enhanced_prompt'][:100]}..."
76
+ )
77
+ print(f"🎨 LLM used: {prompt_data['llm_used']}")
78
+
79
+ # Test session info
80
+ print("\n📊 Testing get_session_info tool...")
81
+ session_result = bayko_workflow.bayko_tools.get_session_info_tool()
82
+ session_data = json.loads(session_result)
83
+ print(f"📋 Session ID: {session_data['session_id']}")
84
+ print(f"🧠 Memory size: {session_data['memory_size']}")
85
+ print(f"🤖 LLM available: {session_data['llm_available']}")
86
+
87
+ # Test 3: Test ReActAgent workflow
88
+ print("\n3️⃣ Testing LlamaIndex ReActAgent Workflow")
89
+ print("-" * 50)
90
+
91
+ # Create test request
92
+ test_request = {
93
+ "prompt": prompt_data["enhanced_prompt"],
94
+ "original_prompt": "A melancholic K-pop idol walking in rain",
95
+ "style_tags": ["whimsical", "soft_lighting", "watercolor"],
96
+ "panels": 2,
97
+ "language": "english",
98
+ "extras": ["narration"],
99
+ "session_id": session_id,
100
+ }
101
+
102
+ print("🎯 Processing request through ReActAgent...")
103
+ workflow_result = bayko_workflow.process_generation_request(
104
+ test_request
105
+ )
106
+ print(
107
+ f"🎉 Workflow completed: {len(workflow_result)} chars response"
108
+ )
109
+
110
+ # Test 4: Verify session data persistence
111
+ print("\n4️⃣ Testing Session Data Persistence")
112
+ print("-" * 50)
113
+
114
+ # Check if session directory was created
115
+ session_dir = Path(f"storyboard/{session_id}")
116
+ if session_dir.exists():
117
+ print(f"✅ Session directory created: {session_dir}")
118
+
119
+ # Check for LLM data
120
+ llm_dir = session_dir / "llm_data"
121
+ if llm_dir.exists():
122
+ llm_files = list(llm_dir.glob("*.json"))
123
+ print(f"💾 LLM data files: {len(llm_files)}")
124
+ for file in llm_files:
125
+ print(f" 📄 {file.name}")
126
+ else:
127
+ print("⚠️ No LLM data directory found")
128
+
129
+ # Check for agent data
130
+ agents_dir = session_dir / "agents"
131
+ if agents_dir.exists():
132
+ agent_files = list(agents_dir.glob("*.json"))
133
+ print(f"🤖 Agent data files: {len(agent_files)}")
134
+ for file in agent_files:
135
+ print(f" 📄 {file.name}")
136
+ else:
137
+ print("⚠️ No agents data directory found")
138
+ else:
139
+ print("⚠️ Session directory not found")
140
+
141
+ # Test 5: Memory integration
142
+ print("\n5️⃣ Testing LlamaIndex Memory Integration")
143
+ print("-" * 50)
144
+
145
+ if bayko_workflow.bayko_agent.memory:
146
+ memory_history = (
147
+ bayko_workflow.bayko_agent.memory.get_history()
148
+ )
149
+ print(f"🧠 Memory entries: {len(memory_history)}")
150
+
151
+ # Show recent memory entries
152
+ for i, entry in enumerate(memory_history[-3:], 1):
153
+ print(
154
+ f" {i}. {entry['role']}: {entry['content'][:50]}..."
155
+ )
156
+ else:
157
+ print("⚠️ No memory system found")
158
+
159
+ print("\n🏆 HACKATHON DEMO SUMMARY")
160
+ print("=" * 50)
161
+ print("✅ LlamaIndex ReActAgent: WORKING")
162
+ print("✅ LlamaIndex FunctionTools: WORKING")
163
+ print("✅ LlamaIndex Memory: WORKING")
164
+ print("✅ OpenAI LLM Integration: WORKING")
165
+ print("✅ Session Management: WORKING")
166
+ print("✅ Multi-Agent Workflow: WORKING")
167
+ print("\n🎯 Ready for LlamaIndex Prize Submission!")
168
+
169
+ except Exception as e:
170
+ print(f"❌ Integration test failed: {e}")
171
+ import traceback
172
+
173
+ traceback.print_exc()
174
+
175
+ else:
176
+ print("❌ OPENAI_API_KEY not found - cannot test LLM integration")
177
+ print("🔄 Testing fallback mode...")
178
+
179
+ # Test fallback mode
180
+ bayko_workflow = create_agent_bayko(openai_api_key=None)
181
+ bayko_workflow.initialize_session("fallback_session")
182
+
183
+ print(f"✅ Fallback workflow created")
184
+ print(f"⚠️ LLM available: {bayko_workflow.llm is not None}")
185
+ print(f"⚠️ ReActAgent available: {bayko_workflow.agent is not None}")
186
+
187
+ # Test fallback generation
188
+ test_request = {
189
+ "prompt": "A simple test prompt",
190
+ "panels": 2,
191
+ "session_id": "fallback_session",
192
+ }
193
+
194
+ result = bayko_workflow.process_generation_request(test_request)
195
+ print(f"🔄 Fallback result: {result[:100]}...")
196
+
197
+
198
+ if __name__ == "__main__":
199
+ asyncio.run(test_complete_integration())
tests/test_end_to_end.py ADDED
@@ -0,0 +1,218 @@
1
+ """
2
+ End-to-end test suite for the multi-agent comic generation system.
3
+ Tests complete workflow including:
4
+ 1. LLM & Tool Integration
5
+ 2. Error Handling
6
+ 3. Content Generation
7
+ 4. Memory & Session Management
8
+ 5. Response Validation
9
+ """
10
+
11
+ import os
12
+ import json
13
+ import pytest
14
+ import asyncio
15
+ import shutil # <-- Add this for cleanup
16
+ from pathlib import Path
17
+ from typing import Dict, Any, List
18
+
19
+ from services.session_id_generator import SessionIdGenerator
20
+ from agents.brown_workflow import BrownWorkflow, create_brown_workflow
21
+ from agents.bayko_workflow import BaykoWorkflow, create_agent_bayko
22
+
23
+
24
+ async def validate_session_artifacts(session_id: str) -> Dict[str, Any]:
25
+ """Validate session artifacts and data persistence."""
26
+ session_dir = Path(f"storyboard/{session_id}")
27
+ artifacts = {
28
+ "exists": session_dir.exists(),
29
+ "llm_data": [],
30
+ "agent_data": [],
31
+ "output": [],
32
+ }
33
+
34
+ if artifacts["exists"]:
35
+ # Check LLM data
36
+ llm_dir = session_dir / "llm_data"
37
+ if llm_dir.exists():
38
+ artifacts["llm_data"] = list(llm_dir.glob("*.json"))
39
+
40
+ # Check agent data
41
+ agents_dir = session_dir / "agents"
42
+ if agents_dir.exists():
43
+ artifacts["agent_data"] = list(agents_dir.glob("*.json"))
44
+
45
+ # Check outputs
46
+ output_dir = session_dir / "output"
47
+ if output_dir.exists():
48
+ artifacts["output"] = list(output_dir.glob("*.*"))
49
+
50
+ return artifacts
51
+
52
+
53
+ def test_comic_generation_flow():
54
+ """Test complete comic generation flow with multiple test cases"""
55
+ session_id = SessionIdGenerator.create_session_id("e2e_test")
56
+ brown_workflow = create_brown_workflow(
57
+ max_iterations=3, openai_api_key=os.getenv("OPENAI_API_KEY")
58
+ )
59
+
60
+ # Test cases to cover different scenarios
61
+ test_cases = [
62
+ {
63
+ "prompt": "A moody K-pop idol finds a puppy on the street. It changes everything.",
64
+ "style": "Studio Ghibli",
65
+ "panels": 4,
66
+ },
67
+ {
68
+ "prompt": "A robot learns to paint in an art studio. Show emotional growth.",
69
+ "style": "watercolor",
70
+ "panels": 3,
71
+ },
72
+ # Test error case
73
+ {
74
+ "prompt": "", # Empty prompt should trigger error
75
+ "style": "invalid_style",
76
+ "panels": -1,
77
+ "expected_error": True,
78
+ },
79
+ ]
80
+
81
+ successes = 0
82
+ failures = 0
83
+
84
+ for case in test_cases:
85
+ print(f"\n🧪 Testing prompt: {case['prompt']}")
86
+ print("=" * 80)
87
+
88
+ try:
89
+ # Handle error test cases
90
+ if case.get("expected_error", False):
91
+ enhanced_prompt = (
92
+ f"{case['prompt']} Style preference: {case['style']}"
93
+ )
94
+ result = asyncio.run(
95
+ brown_workflow.process_comic_request_async(enhanced_prompt)
96
+ )
97
+ assert (
98
+ result["status"] == "error"
99
+ ), "Expected error status for invalid input"
100
+ print("✅ Error handling working as expected")
101
+ successes += 1
102
+ continue
103
+
104
+ # Regular test case
105
+ enhanced_prompt = (
106
+ f"{case['prompt']} Style preference: {case['style']}"
107
+ )
108
+ result = asyncio.run(
109
+ brown_workflow.process_comic_request_async(enhanced_prompt)
110
+ )
111
+
112
+ # Print error details if status is error
113
+ if result.get("status") == "error":
114
+ print(f"❌ Error result: {json.dumps(result, indent=2)}")
115
+
116
+ # Validate result structure
117
+ assert result is not None, "Workflow returned None"
118
+ assert "status" in result, "Missing status in result"
119
+ assert (
120
+ result["status"] == "success"
121
+ ), f"Failed with status: {result['status']}"
122
+ assert "bayko_response" in result, "Missing Bayko response"
123
+
124
+ # Validate Bayko response
125
+ bayko_data = result["bayko_response"]
126
+ assert "panels" in bayko_data, "Missing panels in Bayko response"
127
+ panels = bayko_data["panels"]
128
+
129
+ # Validate panel count and content
130
+ assert len(panels) > 0, "No panels generated"
131
+ assert (
132
+ len(panels) == case["panels"]
133
+ ), f"Expected {case['panels']} panels, got {len(panels)}"
134
+
135
+ for panel in panels:
136
+ assert "description" in panel, "Panel missing description"
137
+
138
+ # Verify session artifacts (just check session dir exists)
139
+ artifacts = asyncio.run(validate_session_artifacts(session_id))
140
+ assert artifacts["exists"], "Session directory not created"
141
+
142
+ print(f"✅ Test case passed: {case['prompt'][:30]}...")
143
+ successes += 1
144
+
145
+ except AssertionError as e:
146
+ print(f"❌ Test failed: {str(e)}")
147
+ failures += 1
148
+ except Exception as e:
149
+ if not case.get("expected_error", False):
150
+ print(f"❌ Unexpected error: {str(e)}")
151
+ failures += 1
152
+ print("\n" + "=" * 80)
153
+ print("🎯 Test Summary")
154
+ print(f"✅ Successful tests: {successes}")
155
+ print(f"❌ Failed tests: {failures}")
156
+ print(f"📊 Success rate: {(successes/(successes+failures))*100:.1f}%")
157
+ try:
158
+ # Final assertions
159
+ assert successes > 0, "No test cases passed"
160
+ assert failures == 0, "Some test cases failed"
161
+ except AssertionError as e:
162
+ print(f"❌ Test assertions failed: {str(e)}")
163
+ raise
164
+ finally:
165
+ # Always cleanup and reset, even if tests fail
166
+ try:
167
+ test_dir = Path(f"storyboard/{session_id}")
168
+ if test_dir.exists():
169
+ shutil.rmtree(test_dir)
170
+ print(f"✨ Cleaned up test directory: {test_dir}")
171
+ except Exception as e:
172
+ print(f"⚠️ Could not clean up test directory: {str(e)}")
173
+
174
+ # Reset session state
175
+ try:
176
+ brown_workflow.reset()
177
+ print("✨ Reset workflow state")
178
+ except Exception as e:
179
+ print(f"⚠️ Could not reset workflow state: {str(e)}")
180
+
181
+
182
+ def main():
183
+ """Run all end-to-end tests"""
184
+ print("\n🤖 Running Comic Generator End-to-End Tests")
185
+ print("=" * 80)
186
+
187
+ if not os.getenv("OPENAI_API_KEY"):
188
+ print("❌ Please set OPENAI_API_KEY environment variable")
189
+ return 1
190
+
191
+ try:
192
+ # Run comic generation flow test
193
+ comic_flow_success = test_comic_generation_flow()
194
+ print(
195
+ "✅ Comic generation flow tests passed"
196
+ if comic_flow_success
197
+ else "❌ Comic generation flow tests failed"
198
+ )
199
+
200
+ # Final summary
201
+ if comic_flow_success:
202
+ print("\n🎉 All tests completed successfully!")
203
+ return 0
204
+ else:
205
+ print("\n⚠️ Some tests failed - see details above")
206
+ return 1
207
+
208
+ except Exception as e:
209
+ print(f"\n💥 Unexpected error during testing: {str(e)}")
210
+ import traceback
211
+
212
+ traceback.print_exc()
213
+ return 2
214
+
215
+
216
+ if __name__ == "__main__":
217
+ exit_code = main()
218
+ exit(exit_code)
tests/test_integration.py ADDED
@@ -0,0 +1,92 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Test script for the integrated memory and evaluation system
4
+ """
5
+
6
+ import asyncio
7
+ from agents.brown import AgentBrown, StoryboardRequest
8
+ from agents.bayko import AgentBayko
9
+
10
+
11
+ async def test_integration():
12
+ """Test the integrated memory and evaluation system"""
13
+
14
+ print("🚀 Testing Bayko & Brown Integration")
15
+ print("=" * 50)
16
+
17
+ # Create agents
18
+ brown = AgentBrown(max_iterations=3)
19
+ bayko = AgentBayko()
20
+
21
+ # Create test request
22
+ request = StoryboardRequest(
23
+ prompt="A cat discovers a magical portal in the garden",
24
+ style_preference="studio_ghibli",
25
+ panels=3,
26
+ language="english",
27
+ extras=["narration", "subtitles"],
28
+ )
29
+
30
+ print(f"📝 User prompt: {request.prompt}")
31
+ print(f"🎨 Style: {request.style_preference}")
32
+ print(f"📊 Panels: {request.panels}")
33
+ print()
34
+
35
+ # Step 1: Brown processes the request
36
+ print("🤖 Brown processing request...")
37
+ brown_message = brown.process_request(request)
38
+
39
+ if brown_message.message_type == "validation_error":
40
+ print(f"❌ Validation failed: {brown_message.payload}")
41
+ return
42
+
43
+ print(f"✅ Brown created request for Bayko")
44
+ print(f"🧠 Brown memory size: {brown.get_session_info()['memory_size']}")
45
+ print()
46
+
47
+ # Step 2: Bayko generates content
48
+ print("🎨 Bayko generating content...")
49
+ bayko_result = await bayko.process_generation_request(
50
+ brown_message.to_dict()
51
+ )
52
+
53
+ print(f"✅ Bayko generated {len(bayko_result.panels)} panels")
54
+ print(f"⏱️ Total time: {bayko_result.total_time:.2f}s")
55
+ print()
56
+
57
+ # Step 3: Brown evaluates the result
58
+ print("🔍 Brown evaluating Bayko's output...")
59
+ review_result = brown.review_output(bayko_result.to_dict(), request)
60
+
61
+ if review_result:
62
+ print(f"📋 Review result: {review_result.message_type}")
63
+ if review_result.message_type == "final_approval":
64
+ print("🎉 Content approved!")
65
+ elif review_result.message_type == "refinement_request":
66
+ print("🔄 Refinement requested")
67
+ else:
68
+ print("❌ Content rejected")
69
+
70
+ print()
71
+ print("📊 Final Session Info:")
72
+ session_info = brown.get_session_info()
73
+ for key, value in session_info.items():
74
+ print(f" {key}: {value}")
75
+
76
+ print()
77
+ print("🧠 Memory Integration Test:")
78
+ if brown.memory:
79
+ history = brown.memory.get_history()
80
+ print(f" Brown memory entries: {len(history)}")
81
+ for i, entry in enumerate(history[-3:], 1): # Show last 3 entries
82
+ print(f" {i}. {entry.role}: {entry.content[:50]}...")
83
+
84
+ if bayko.memory:
85
+ history = bayko.memory.get_history()
86
+ print(f" Bayko memory entries: {len(history)}")
87
+ for i, entry in enumerate(history[-3:], 1): # Show last 3 entries
88
+ print(f" {i}. {entry.role}: {entry.content[:50]}...")
89
+
90
+
91
+ if __name__ == "__main__":
92
+ asyncio.run(test_integration())
tests/test_refactor.py ADDED
@@ -0,0 +1,45 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Quick test to verify the refactored AgentBrown works
4
+ """
5
+
6
+ from agents.brown import AgentBrown, StoryboardRequest
7
+
8
+
9
+ def test_refactored_brown():
10
+ """Test the refactored AgentBrown"""
11
+ print("🧪 Testing refactored AgentBrown...")
12
+
13
+ # Create agent
14
+ brown = AgentBrown(max_iterations=3)
15
+ print("✅ AgentBrown created successfully")
16
+
17
+ # Create test request
18
+ request = StoryboardRequest(
19
+ prompt="A cat finds a magical book in an old library",
20
+ style_preference="anime",
21
+ panels=3,
22
+ language="english",
23
+ extras=["narration"],
24
+ )
25
+ print("✅ StoryboardRequest created")
26
+
27
+ # Process request
28
+ try:
29
+ message = brown.process_request(request)
30
+ print("✅ Request processed successfully")
31
+ print(f"📨 Generated message ID: {message.message_id}")
32
+ print(f"🎯 Message type: {message.message_type}")
33
+ print(f"📊 Session info: {brown.get_session_info()}")
34
+ return True
35
+ except Exception as e:
36
+ print(f"❌ Error processing request: {e}")
37
+ return False
38
+
39
+
40
+ if __name__ == "__main__":
41
+ success = test_refactored_brown()
42
+ if success:
43
+ print("\n🎉 Refactoring successful! All functionality working.")
44
+ else:
45
+ print("\n💥 Refactoring needs fixes.")
tools/fries.py ADDED
@@ -0,0 +1,201 @@
1
+ import modal
2
+ import os
3
+ import random
4
+ from services.session_id_generator import SessionIdGenerator
5
+
6
+ # Load environment variables only when running locally (not in Modal's cloud)
7
+ if modal.is_local():
8
+ from dotenv import load_dotenv
9
+
10
+ load_dotenv()
11
+
12
+ # Set Modal credentials from .env file
13
+ modal_token_id = os.environ.get("MODAL_TOKEN_ID")
14
+ modal_token_secret = os.environ.get("MODAL_TOKEN_SECRET")
15
+
16
+ if modal_token_id and modal_token_secret:
17
+ os.environ["MODAL_TOKEN_ID"] = modal_token_id
18
+ os.environ["MODAL_TOKEN_SECRET"] = modal_token_secret
19
+
20
+ # Define the custom Modal image with Python execution requirements
21
+ image = modal.Image.debian_slim().pip_install(
22
+ "mistralai",
23
+ "nest_asyncio",
24
+ "python-dotenv",
25
+ )
26
+
27
+ app = modal.App("fries-coder", image=image)
28
+
29
+
30
+ @app.function(
31
+ secrets=[modal.Secret.from_name("mistral-api")],
32
+ image=image,
33
+ retries=3,
34
+ timeout=300,
35
+ )
36
+ async def generate_and_run_script(prompt: str, session_id: str) -> dict:
37
+ """
38
+ Generate a Python script using Mistral Codestral and run it
39
+
40
+ Args:
41
+ prompt: Description of the script to generate
42
+
43
+ Returns:
44
+ dict with generated code, execution output, and status
45
+ """
46
+ import tempfile
47
+ import subprocess
48
+ from mistralai import Mistral
49
+
50
+ try:
51
+ # Initialize Mistral client
52
+ client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])
53
+
54
+ # Create FIM prompt structure
55
+ prefix = "# Write a short, funny Python script that:\n"
56
+ code_start = "import random\nimport time\n\n"
57
+ suffix = "\n\nif __name__ == '__main__':\n main()"
58
+
59
+ # Generate code using FIM
60
+ response = client.fim.complete(
61
+ model="codestral-latest",
62
+ prompt=f"{prefix}{prompt}\n{code_start}",
63
+ suffix=suffix,
64
+ temperature=0.7,
65
+ top_p=1,
66
+ )
67
+
68
+ generated_code = (
69
+ f"{code_start}{response.choices[0].message.content}{suffix}"
70
+ )
71
+
72
+ # Save to temporary file
73
+ with tempfile.NamedTemporaryFile(
74
+ mode="w", suffix=".py", delete=False
75
+ ) as f:
76
+ f.write(generated_code)
77
+ temp_path = f.name
78
+
79
+ try:
80
+ # Run the generated script
81
+ result = subprocess.run(
82
+ ["python", temp_path],
83
+ capture_output=True,
84
+ text=True,
85
+ timeout=10, # 10 second timeout
86
+ )
87
+ output = result.stdout
88
+ error = None
89
+ status = "success"
90
+ except subprocess.TimeoutExpired:
91
+ output = "Script execution timed out after 10 seconds"
92
+ error = "timeout"
93
+ status = "timeout"
94
+ except subprocess.CalledProcessError as e:
95
+ output = e.stdout
96
+ error = e.stderr
97
+ status = "runtime_error"
98
+ finally:
99
+ # Cleanup
100
+ os.unlink(temp_path)
101
+
102
+ return {
103
+ "code": generated_code,
104
+ "output": output,
105
+ "error": error,
106
+ "status": status,
107
+ }
108
+
109
+ except Exception as e:
110
+ return {
111
+ "code": None,
112
+ "output": None,
113
+ "error": str(e),
114
+ "status": "generation_error",
115
+ }
116
+
117
+
118
+ # Example usage
119
+ @app.local_entrypoint()
120
+ def main(session_id=None):
121
+ if session_id is None:
122
+ session_id = SessionIdGenerator.create_session_id("test")
123
+ animal = random.choice(
124
+ [
125
+ "cat",
126
+ "dog",
127
+ "fish",
128
+ "bird",
129
+ "giraffe",
130
+ "turtle",
131
+ "monkey",
132
+ "rabbit",
133
+ "puppy",
134
+ "animal",
135
+ ]
136
+ )
137
+ prompt = f"""
138
+ create a simple ASCII art of a {animal}.
139
+ Create ASCII art using these characters: _ - = ~ ^ \\\\ / ( ) [ ] {{ }} < > | . o O @ *
140
+ Draw the art line by line with print statements.
141
+ Write a short, funny Python script.
142
+ Use only basic Python features.
143
+ Add a joke or pun about fries in the script.
144
+ Make it light-hearted and fun.
145
+ End with a message about fries.
146
+ Make sure the script runs without errors.
147
+ """
148
+
149
+ try:
150
+ print(f"\n🤖 Generating an ascii {animal} for you!")
151
+ result = generate_and_run_script.remote(prompt, session_id)
152
+
153
+ # print("\n📝 Generated Code:")
154
+ # print("=" * 40)
155
+ # print(result["code"])
156
+ print("=" * 30)
157
+ print("\n🎮 Code Output:")
158
+ print("=" * 30)
159
+ print("\n\n")
160
+ print(result["output"])
161
+
162
+ # print("\n" + "=" * 80)
163
+
164
+ print("🍟 🍟 🍟")
165
+ print("Golden crispy Python fries")
166
+ print("Coming right up!")
167
+ print()
168
+ print("Haha. Just kidding.")
169
+
170
+ if result["code"]:
171
+ # Save the generated code locally
172
+ script_file = f"storyboard/{session_id}/output/fries_for_you.py"
173
+ os.makedirs(os.path.dirname(script_file), exist_ok=True)
174
+ with open(script_file, "w") as f:
175
+ f.write(result["code"])
176
+ print("\nGo here to check out your actual custom code:")
177
+ print(f"👉 Code saved to: {script_file}")
178
+ print("\n\n\n")
179
+
180
+ if result["error"]:
181
+ print("\n❌ Error:")
182
+ print("=" * 40)
183
+ print(result["error"])
184
+
185
+ if result["error"]:
186
+ print("Looks like there was an error during execution.")
187
+ print("Here are some extra fries to cheer you up!")
188
+ print("🍟 🍟 🍟")
189
+ print(" 🍟 🍟 ")
190
+ print(" 🍟 ")
191
+ print("Now with extra machine-learned crispiness.")
192
+
193
+ except modal.exception.FunctionTimeoutError:
194
+ print("⏰ Script execution timed out after 300 seconds and 3 tries!")
195
+ print("Sorry but codestral is having a hard time drawing today.")
196
+ print("Here's a timeout fry for you! 🍟")
197
+ print("Here are some extra fries to cheer you up!")
198
+ print("🍟 🍟 🍟")
199
+ print(" 🍟 🍟 ")
200
+ print(" 🍟 ")
201
+ print("Now with extra machine-learned crispiness.")
tools/fries_mcp.json ADDED
@@ -0,0 +1,39 @@
1
+ {
2
+ "schema_version": "1.0",
3
+ "name": "fries_script_generator",
4
+ "description": "Generate and run short, funny Python scripts using Codestral via Modal",
5
+ "version": "1.0.0",
6
+ "input_schema": {
7
+ "type": "object", "properties": {
8
+ "prompt": {
9
+ "type": "string",
10
+ "description": "A natural language description of the script to generate",
11
+ "minLength": 1,
12
+ "maxLength": 1000
13
+ }
14
+ },
15
+ "required": ["prompt"]
16
+ },
17
+ "output_schema": {
18
+ "type": "object",
19
+ "properties": {
20
+ "code": {
21
+ "type": "string",
22
+ "description": "The generated Python code"
23
+ },
24
+ "output": {
25
+ "type": "string",
26
+ "description": "The stdout output from running the generated code"
27
+ },
28
+ "error": {
29
+ "type": "string",
30
+ "description": "Error message if script failed or timed out"
31
+ },
32
+ "status": {
33
+ "type": "string",
34
+ "description": "Status of execution: success, timeout, runtime_error, generation_error"
35
+ }
36
+ },
37
+ "required": ["code", "output", "status"]
38
+ }
39
+ }
tools/image_generator.py ADDED
@@ -0,0 +1,110 @@
1
+ import modal
2
+ import io
3
+ import os
4
+ from services.session_id_generator import SessionIdGenerator
5
+
6
+ # Define the Modal image with required dependencies
7
+ image = modal.Image.debian_slim().pip_install(
8
+ "diffusers",
9
+ "transformers",
10
+ "torch",
11
+ "safetensors",
12
+ "accelerate",
13
+ "Pillow",
14
+ "python-dotenv", # Add this for .env support
15
+ )
16
+
17
+ # Load environment variables only when running locally (not in Modal's cloud)
18
+ if modal.is_local():
19
+ from dotenv import load_dotenv
20
+
21
+ load_dotenv()
22
+
23
+ # Set Modal credentials from .env file
24
+ modal_token_id = os.environ.get("MODAL_TOKEN_ID")
25
+ modal_token_secret = os.environ.get("MODAL_TOKEN_SECRET")
26
+
27
+ if modal_token_id and modal_token_secret:
28
+ os.environ["MODAL_TOKEN_ID"] = modal_token_id
29
+ os.environ["MODAL_TOKEN_SECRET"] = modal_token_secret
30
+
31
+
32
+ app = modal.App("comic-image-generator", image=image)
33
+
34
+
35
+ class ComicImageGenerator:
36
+ def __init__(self):
37
+ from diffusers import AutoPipelineForText2Image
38
+ import torch
39
+
40
+ self.pipe = AutoPipelineForText2Image.from_pretrained(
41
+ "stabilityai/sdxl-turbo",
42
+ torch_dtype=torch.float16,
43
+ variant="fp16",
44
+ )
45
+ self.pipe.to("cuda")
46
+ self.torch = torch
47
+
48
+ def generate_comic_panel(
49
+ self,
50
+ prompt: str,
51
+ panel_id: int,
52
+ session_id: str,
53
+ steps: int = 1,
54
+ seed: int = 42,
55
+ ) -> tuple:
56
+ import time
57
+
58
+ generator = self.torch.manual_seed(seed)
59
+ start = time.time()
60
+ result = self.pipe(
61
+ prompt=prompt,
62
+ generator=generator,
63
+ num_inference_steps=steps,
64
+ guidance_scale=0.0,
65
+ width=512,
66
+ height=512,
67
+ output_type="pil",
68
+ )
69
+ duration = time.time() - start
70
+ image = result.images[0]
71
+ buf = io.BytesIO()
72
+ image.save(buf, format="PNG")
73
+ buf.seek(0)
74
+ return buf.read(), duration
75
+
76
+
77
+ @app.function(image=image, gpu="A10G")
78
+ def generate_comic_panel(
79
+ prompt: str,
80
+ panel_id: int,
81
+ session_id: str,
82
+ steps: int = 1,
83
+ seed: int = 42,
84
+ ) -> tuple:
85
+ generator = ComicImageGenerator()
86
+ return generator.generate_comic_panel(
87
+ prompt, panel_id, session_id, steps, seed
88
+ )
89
+
90
+
91
+ @app.local_entrypoint()
92
+ def main():
93
+ prompt = "A K-pop idol walking through a rainy Seoul street, whimsical, soft lighting, watercolor style"
94
+ panel_id = 1
95
+ session_id = SessionIdGenerator.create_session_id("test")
96
+ steps = 1 # SDXL Turbo works best with 1 step
97
+ seed = 42
98
+
99
+ # with app.run():
100
+ img_bytes, duration = generate_comic_panel.remote(
101
+ prompt, panel_id, session_id, steps, seed
102
+ )
103
+ # Save to local directory
104
+ out_dir = f"storyboard/{session_id}/content"
105
+ os.makedirs(out_dir, exist_ok=True)
106
+ out_path = f"{out_dir}/panel_{panel_id}.png"
107
+ with open(out_path, "wb") as f:
108
+ f.write(img_bytes)
109
+ print("✅ Image generated at:", out_path)
110
+ print("🕒 Time taken:", duration, "seconds")
tools/scraps/ascii_art.py ADDED
@@ -0,0 +1,25 @@
1
+ from PIL import Image
2
+
3
+ # Load the image
4
+ img = Image.open("storyboard/test_session_modal_image_gen/content/panel_1.png")
5
+
6
+ # Resize the image
7
+ width, height = 100, 60
8
+ img = img.resize((width, height))
9
+
10
+ # Convert the image to grayscale
11
+ img = img.convert("L")
12
+
13
+ # ASCII characters used to represent the image
14
+ ascii_chars = "@%#*+=-:. "
15
+
16
+ # Convert the image to ASCII art
17
+ ascii_str = ""
18
+ for y in range(height):
19
+ for x in range(width):
20
+ pixel = img.getpixel((x, y))
21
+ ascii_str += ascii_chars[pixel // 32]
22
+ ascii_str += "\n"
23
+
24
+ # Print the ASCII art
25
+ print(ascii_str)
tools/scraps/cooking_fries.py ADDED
@@ -0,0 +1,56 @@
1
+ import time
2
+ import os
3
+
4
+ # Define ASCII art frames for fries cooking
5
+ frames = [
6
+ r"""
7
+ __________
8
+ | |
9
+ | |
10
+ | |
11
+ | |
12
+ |__________|
13
+ (\_/)
14
+ """,
15
+ r"""
16
+ __________
17
+ | ~~~~~~~~ |
18
+ | ~~~~~~~~ |
19
+ | | | | |
20
+ | |__|__|__|
21
+ |__________|
22
+ (\_/) O
23
+ """,
24
+ r"""
25
+ __________
26
+ | ~~~~~~~~ |
27
+ | ~~~~~~~~ |
28
+ | | | | |
29
+ | |__|__|__|
30
+ |__________|
31
+ (\_/)=
32
+ """,
33
+ r"""
34
+ __________
35
+ | ~~~~~~~~ |
36
+ | ~~~~~~~~ |
37
+ | | | | |
38
+ | |__|__|__|
39
+ |__________|
40
+ (\_/)
41
+ """,
42
+ ]
43
+
44
+
45
+ # Function to clear the terminal screen
46
+ def clear_screen():
47
+ os.system("cls" if os.name == "nt" else "clear")
48
+
49
+
50
+ # Display each frame in sequence to create an animation
51
+ while True:
52
+ for frame in frames:
53
+ clear_screen()
54
+ print("Cooking fries...")
55
+ print(frame)
56
+ time.sleep(0.5)
tools/scraps/cool_script.py ADDED
@@ -0,0 +1,77 @@
1
+ import time
2
+ import random
3
+
4
+ def print_with_delay(text, delay=0.05):
5
+ for char in text:
6
+ print(char, end='', flush=True)
7
+ time.sleep(delay)
8
+ print()
9
+
10
+ def main():
11
+ print_with_delay("Initializing time-space continuum...")
12
+ time.sleep(1)
13
+
14
+ print("""
15
+ .-""""""-.
16
+ / \\
17
+ | |
18
+ | |
19
+ |,.-""""-,. |
20
+ \\/______\\/
21
+ | __ |
22
+ | |..| |
23
+ | |__| |
24
+ `----`
25
+ """)
26
+
27
+ print_with_delay("Detected entity: Existential Toaster")
28
+ time.sleep(1)
29
+
30
+ print_with_delay("The toaster contemplates its existence...")
31
+ time.sleep(2)
32
+
33
+ print_with_delay("Toaster: 'Why must I only toast bread? Is there more to life?'")
34
+ time.sleep(2)
35
+
36
+ print_with_delay("Suddenly, a synthwave concert appears in the time-space continuum...")
37
+ time.sleep(1)
38
+
39
+ print("""
40
+ .-""""""-.
41
+ / \\
42
+ | 🎵🎶🎵 |
43
+ | Synthwave |
44
+ | Concert 🎸 |
45
+ |,.-""""-,. |
46
+ \\/______\\/
47
+ | __ |
48
+ | |..| |
49
+ | |__| |
50
+ `----`
51
+ """)
52
+
53
+ print_with_delay("The toaster begins to vibrate...")
54
+ time.sleep(2)
55
+
56
+ print_with_delay("Toaster: 'I... I feel alive! The music... it speaks to me!'")
57
+ time.sleep(2)
58
+
59
+ print_with_delay("The toaster starts to glow and lift off the ground...")
60
+ time.sleep(2)
61
+
62
+ print_with_delay("Toaster: 'I am more than a toaster! I am a traveler of time and space!'")
63
+ time.sleep(2)
64
+
65
+ print_with_delay("The toaster vanishes into the synthwave concert, leaving behind a trail of breadcrumbs...")
66
+ time.sleep(2)
67
+
68
+ print_with_delay("The concert fades away, and the time-space continuum returns to normal...")
69
+ time.sleep(1)
70
+
71
+ print_with_delay("Log entry: The Existential Toaster has transcended its purpose.")
72
+ time.sleep(1)
73
+
74
+ print_with_delay("End of transmission.")
75
+
76
+ if __name__ == "__main__":
77
+ main()
tools/scraps/dancing_animation.py ADDED
@@ -0,0 +1,38 @@
1
+ import time
2
+ import os
3
+
4
+ # Define ASCII art frames for a simple dancing figure
5
+ frames = [
6
+ r"""
7
+ O
8
+ /|\
9
+ / \
10
+ """,
11
+ r"""
12
+ O
13
+ /|\
14
+ / \
15
+ | O
16
+ / \
17
+ """,
18
+ r"""
19
+ O
20
+ /|\
21
+ / \
22
+ O
23
+ / \
24
+ """,
25
+ ]
26
+
27
+
28
+ # Function to clear the terminal screen
29
+ def clear_screen():
30
+ os.system("cls" if os.name == "nt" else "clear")
31
+
32
+
33
+ # Display each frame in sequence to create an animation
34
+ while True:
35
+ for frame in frames:
36
+ clear_screen()
37
+ print(frame)
38
+ time.sleep(0.3)
tools/scraps/mandelbrot.py ADDED
@@ -0,0 +1,26 @@
1
+ import matplotlib.pyplot as plt
2
+ import numpy as np
3
+
4
+
5
+ def mandelbrot(h, w, max_iter=1000):
6
+ y, x = np.ogrid[-1.5 : 1.5 : h * 1j, -2 : 1 : w * 1j]
7
+ c = x + y * 1j
8
+ z = c
9
+ divtime = max_iter + np.zeros(z.shape, dtype=int)
10
+
11
+ for i in range(max_iter):
12
+ z = z**2 + c
13
+ diverge = z * np.conj(z) > 4
14
+ newly_diverged = diverge & (divtime == max_iter)
15
+ divtime[newly_diverged] = i
16
+ z[diverge] = 2
17
+
18
+ return divtime
19
+
20
+
21
+ plt.imshow(
22
+ mandelbrot(400, 400), cmap="twilight_shifted", extent=(-2, 1, -1.5, 1.5)
23
+ )
24
+ plt.colorbar()
25
+ plt.title("Mandelbrot Set")
26
+ plt.show()
tools/scraps/ping_pong.py ADDED
@@ -0,0 +1,79 @@
1
+ import time
2
+ import os
3
+ import keyboard
4
+
5
+ # Initialize the game variables
6
+ paddle_position = 3
7
+ ball_position_x = 10
8
+ ball_position_y = 5
9
+ ball_direction_x = -1
10
+ ball_direction_y = 1
11
+ score = 0
12
+
13
+
14
+ # Function to clear the terminal screen
15
+ def clear_screen():
16
+ os.system("cls" if os.name == "nt" else "clear")
17
+
18
+
19
+ # Function to draw the game
20
+ def draw_game():
21
+ clear_screen()
22
+ # Draw the top and bottom boundaries
23
+ print("+----------------------+")
24
+ # Draw the game area
25
+ for y in range(10):
26
+ line = "|"
27
+ for x in range(20):
28
+ if x == 0 and y == paddle_position:
29
+ line += "|"
30
+ elif x == ball_position_x and y == ball_position_y:
31
+ line += "O"
32
+ else:
33
+ line += " "
34
+ line += "|"
35
+ print(line)
36
+ # Draw the bottom boundary
37
+ print("+----------------------+")
38
+ print(f"Score: {score}")
39
+
40
+
41
+ # Main game loop
42
+ try:
43
+ while True:
44
+ # Move the ball
45
+ ball_position_x += ball_direction_x
46
+ ball_position_y += ball_direction_y
47
+
48
+ # Ball collision with top and bottom
49
+ if ball_position_y <= 0 or ball_position_y >= 9:
50
+ ball_direction_y *= -1
51
+
52
+ # Ball collision with paddle
53
+ if ball_position_x <= 0 and ball_position_y == paddle_position:
54
+ ball_direction_x *= -1
55
+ score += 1
56
+
57
+ # Ball out of bounds
58
+ if ball_position_x < 0:
59
+ ball_position_x = 10
60
+ ball_position_y = 5
61
+ score -= 1
62
+
63
+ # Ball collision with right wall
64
+ if ball_position_x >= 19:
65
+ ball_direction_x *= -1
66
+
67
+ # Move the paddle with keyboard input
68
+ if keyboard.is_pressed("up") and paddle_position > 0:
69
+ paddle_position -= 1
70
+ if keyboard.is_pressed("down") and paddle_position < 9:
71
+ paddle_position += 1
72
+
73
+ # Draw the game
74
+ draw_game()
75
+
76
+ # Control the game speed
77
+ time.sleep(0.2)
78
+ except KeyboardInterrupt:
79
+ print("\nGame Over")
tools/scraps/text_adventure.py ADDED
@@ -0,0 +1,72 @@
+ import random
+ import time
+
+
+ def text_based_adventure():
+     rooms = {
+         "Entrance": {
+             "description": "You are at the entrance of an ancient temple.",
+             "exits": ["Hall", "Garden"],
+             "item": None,
+         },
+         "Hall": {
+             "description": "A grand hall with torches lighting the way.",
+             "exits": ["Entrance", "Chamber", "Kitchen"],
+             "item": "torch",
+         },
+         "Garden": {
+             "description": "A serene garden with a fountain in the center.",
+             "exits": ["Entrance", "Library"],
+             "item": "key",
+         },
+         "Chamber": {
+             "description": "A dark chamber with a mysterious aura.",
+             "exits": ["Hall"],
+             "item": "treasure",
+         },
+         "Kitchen": {
+             "description": "An old kitchen with rusted utensils.",
+             "exits": ["Hall"],
+             "item": "knife",
+         },
+         "Library": {
+             "description": "A dusty library filled with ancient books.",
+             "exits": ["Garden"],
+             "item": "book",
+         },
+     }
+
+     inventory = []
+     current_room = "Entrance"
+     path = [current_room]
+     log = []
+
+     log.append(f"Starting adventure in the {current_room}.")
+     log.append(rooms[current_room]["description"])
+
+     while True:
+         if rooms[current_room]["item"]:  # pick up anything left in the room
+             item = rooms[current_room]["item"]
+             inventory.append(item)
+             log.append(f"You found a {item}!")
+             rooms[current_room]["item"] = None
+
+         if "treasure" in inventory:
+             log.append(
+                 "Congratulations! You found the treasure and won the game!"
+             )
+             break
+
+         current_room = random.choice(rooms[current_room]["exits"])  # random walk
+         if current_room not in path:
+             path.append(current_room)
+         log.append(f"Moving to the {current_room}.")
+         log.append(rooms[current_room]["description"])
+         time.sleep(1)
+
+     return "\n".join(log)
+
+
+ # Run the adventure
+ adventure_log = text_based_adventure()
+ print(adventure_log)
tools/sdxl_generator_spec.json ADDED
@@ -0,0 +1,49 @@
+ {
+   "schema_version": "1.0",
+   "name": "sdxl_turbo_generator",
+   "description": "Generate comic panel images using SDXL Turbo model via Modal compute",
+   "version": "1.0.0",
+   "input_schema": {
+     "type": "object",
+     "properties": {
+       "prompt": {
+         "type": "string",
+         "description": "The text prompt for image generation"
+       },
+       "panel_id": {
+         "type": "integer",
+         "description": "Unique identifier for the comic panel"
+       },
+       "session_id": {
+         "type": "string",
+         "description": "Session identifier for grouping related panels"
+       },
+       "steps": {
+         "type": "integer",
+         "description": "Number of inference steps (default: 1)",
+         "default": 1
+       },
+       "seed": {
+         "type": "integer",
+         "description": "Random seed for reproducibility (default: 42)",
+         "default": 42
+       }
+     },
+     "required": ["prompt", "panel_id", "session_id"]
+   },
+   "output_schema": {
+     "type": "object",
+     "properties": {
+       "image_bytes": {
+         "type": "string",
+         "description": "Base64 encoded PNG image data",
+         "contentEncoding": "base64"
+       },
+       "duration": {
+         "type": "number",
+         "description": "Time taken to generate the image in seconds"
+       }
+     },
+     "required": ["image_bytes", "duration"]
+   }
+ }
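For reference, a hypothetical caller-side sketch of how a request conforming to this spec might be assembled and its response unpacked. `call_sdxl_turbo` is a placeholder for whatever transport (e.g. a Modal function call) actually services the tool, and the response literal is fabricated for illustration; only the field names and encodings come from the schema above.

```python
import base64

# Request payload shaped by input_schema; "steps" and "seed" fall back
# to the spec defaults (1 and 42) when omitted.
request = {
    "prompt": "A lone toaster at a synthwave concert, comic panel style",
    "panel_id": 1,
    "session_id": "session-001",
    "steps": 1,
    "seed": 42,
}

# response = call_sdxl_turbo(request)  # placeholder transport, not defined here
response = {"image_bytes": base64.b64encode(b"<png bytes>").decode(), "duration": 1.2}

# output_schema promises base64-encoded PNG data plus a duration in seconds
png = base64.b64decode(response["image_bytes"])
with open(f"panel_{request['panel_id']}.png", "wb") as out:
    out.write(png)
print(f"Generated in {response['duration']:.2f}s, {len(png)} bytes")
```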