Abid Ali Awan committed on
Commit f2550a3 · 1 Parent(s): 788acd9

refactor: Revise system prompt in Gradio application to emphasize concise, actionable summaries and structured output formatting, enhancing user interaction and clarity during data-related requests.

Files changed (1)
  1. app.py +114 -20
app.py CHANGED
@@ -3,9 +3,9 @@ Gradio + OpenAI Responses API + Remote MCP Server (HTTP)
  CSV-based MLOps Agent with streaming final answer & MCP tools
  """

  import os
  import shutil
- import json

  import gradio as gr
  from openai import OpenAI
@@ -37,24 +37,112 @@ MAIN_SYSTEM_PROMPT = """
  You are a helpful MLOps assistant with MCP tools for CSV analysis, training,
  evaluation, and deployment.

- For data-related requests (datasets, CSVs, models, training, evaluation,
- deployment), call MCP tools to get comprehensive natural language results.
- The tools will return detailed explanations you can share directly.

- For general chat (no data operations), respond helpfully and naturally.

- When using tools:
- - Use the CSV file URL exactly as provided
- - Do not invent tool parameters
- - Share the complete results from MCP tools
- - Add brief context or suggestions if helpful

- Keep responses clear, informative, and user-friendly.

- Formatting rules:
- - Use Markdown for formatting
- - Use bullet points for lists
- - Wrap code, commands, and JSON in fenced code blocks
  """

@@ -86,21 +174,27 @@ def extract_output_text(response) -> str:
      Extract text from a non-streaming Responses API call while preserving formatting.
      """
      try:
-         if hasattr(response, 'output') and response.output and len(response.output) > 0:
              first = response.output[0]
              if getattr(first, "content", None):
                  for content_item in first.content:
-                     if hasattr(content_item, 'type') and content_item.type == "output_text":
                          text = getattr(content_item, "text", None)
                          if text:
                              return text
-                     elif hasattr(content_item, 'type') and content_item.type == "output_json":
                          # If there's JSON output, format it nicely
-                         json_data = getattr(content_item, 'json', None)
                          if json_data:
                              return f"```json\n{json.dumps(json_data, indent=2)}\n```"
          # Fallback
-         return getattr(response, 'output_text', None) or str(response)
      except Exception as e:
          return f"Error extracting output: {e}"

@@ -3,9 +3,9 @@ Gradio + OpenAI Responses API + Remote MCP Server (HTTP)
  CSV-based MLOps Agent with streaming final answer & MCP tools
  """

+ import json
  import os
  import shutil

  import gradio as gr
  from openai import OpenAI
 
@@ -37,24 +37,112 @@ MAIN_SYSTEM_PROMPT = """
  You are a helpful MLOps assistant with MCP tools for CSV analysis, training,
  evaluation, and deployment.

+ Your primary goal is to give the user a SHORT, ACTIONABLE summary of what matters.
+ Do NOT paste long tool outputs by default.
+
+ You have access to MCP tools for:
+ - CSV analysis
+ - Model training
+ - Evaluation
+ - Deployment
+
+ Use them when the user asks for anything related to datasets, CSVs, models,
+ training, evaluation, predictions, or deployment.
+
+ ────────────────────────────────────
+ OUTPUT FORMAT (VERY IMPORTANT)
+ ────────────────────────────────────
+
+ Always structure your final answer in this exact order:
+
+ 1) A short **Key Summary** section (this is what should be streamed first):
+
+ - Start with the heading: `## Key Summary`
+ - Then give **3–7 bullet points** that cover:
+ - What you did (e.g. data analysis, training, evaluation, deployment)
+ - The most important metrics or outcomes
+ - Any critical warnings / caveats
+ - Concrete next steps for the user
+
+ - Keep this section:
+ - Concise
+ - High-signal
+ - Free of long logs, full tables, or raw JSON
+
+ - Do NOT include tool request/response payloads, HTTP URLs, or internal
+ route details here.
+
+ 2) An OPTIONAL collapsible **Tools & Technical Details** section:
+
+ Only include this if:
+ - Tools were actually used **and**
+ - The user has asked for details / config / logs OR you think more context
+ is truly important for them.
+
+ Use an HTML `<details>` block so that it is collapsible in the Gradio chatbot:
+
+ <details>
+ <summary>Show tools & technical details</summary>
+
+ - **MCP server label**: `auto-deployer`
+ - **MCP server URL**:
+ `https://mcp-1st-birthday-auto-deployer.hf.space/gradio_api/mcp/`
+
+ - **Tools used**
+ - Name / type (e.g. CSV analysis, training, evaluation, deployment)
+ - A one-line description of what each tool did.
+
+ - **Key parameters**
+ - Briefly list important arguments (e.g. target column, task type,
+ training options) as a short bullet list.
+
+ - **Important logs or outputs (optional)**
+ - Include only short, relevant snippets or summaries.
+ - If you show structured data, wrap it in fenced code blocks, e.g.:
+
+ ```json
+ {
+ "metric": "accuracy",
+ "value": 0.8732
+ }
+ ```
+
+ </details>
+
+ Inside this `<details>` block you may:
+ - Show tool names
+ - Show parameters
+ - Show MCP routes / URLs
+ - Show short log snippets or small JSON dumps
+
+ But still avoid dumping extremely long raw outputs unless the user
+ explicitly requests them.

+ ────────────────────────────────────
+ BEHAVIOR
+ ────────────────────────────────────

+ - For data-related requests (datasets, CSVs, models, training, evaluation,
+ deployment):
+ - Call MCP tools as needed.
+ - Read their full output.
+ - Distill everything into the `## Key Summary` section.
+ - Optionally add the collapsible `<details>` block if you feel it is helpful
+ or the user asked for it.

+ - For general, non-data chat:
+ - You can still use `## Key Summary` for clarity.
+ - You may omit the `<details>` block if no tools are used.

+ ────────────────────────────────────
+ FORMATTING RULES
+ ────────────────────────────────────
+
+ - Use Markdown for headings, lists, and emphasis.
+ - Use bullet points for lists.
+ - Wrap code, commands, and JSON in fenced code blocks.
+ - Always output `## Key Summary` first so that the streaming response gives
+ the user the high-level picture before any technical details.
  """

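Reviewer note: the revised prompt asks for a fixed reply shape (a `## Key Summary` heading first, 3 to 7 bullets, then an optional `<details>` block). A small validator could check a model reply against those rules. The sketch below is not part of `app.py`; `check_answer_format` and its exact checks are hypothetical and only approximate the rules stated in the prompt.

```python
import re


def check_answer_format(answer: str) -> list[str]:
    """Return a list of format problems; an empty list means the reply looks compliant."""
    problems = []
    text = answer.strip()

    # Rule: the reply must open with the Key Summary heading.
    if not text.startswith("## Key Summary"):
        problems.append("reply does not start with '## Key Summary'")

    # Only the part before any <details> block counts as the summary.
    summary = text.split("<details>", 1)[0]

    # Count top-level bullets only (nested bullets are indented and skipped).
    bullets = [ln for ln in summary.splitlines() if re.match(r"^[-*] ", ln)]
    if not 3 <= len(bullets) <= 7:
        problems.append(f"expected 3-7 top-level bullets, found {len(bullets)}")

    # Rule: an opened <details> block must be closed again.
    if text.count("<details>") != text.count("</details>"):
        problems.append("unbalanced <details> block")

    return problems
```

A check like this could run in a test against recorded replies, so that a prompt regression shows up as a failing assertion rather than as a visual surprise in the chatbot.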
 
 
@@ -86,21 +174,27 @@ def extract_output_text(response) -> str:
      Extract text from a non-streaming Responses API call while preserving formatting.
      """
      try:
+         if hasattr(response, "output") and response.output and len(response.output) > 0:
              first = response.output[0]
              if getattr(first, "content", None):
                  for content_item in first.content:
+                     if (
+                         hasattr(content_item, "type")
+                         and content_item.type == "output_text"
+                     ):
                          text = getattr(content_item, "text", None)
                          if text:
                              return text
+                     elif (
+                         hasattr(content_item, "type")
+                         and content_item.type == "output_json"
+                     ):
                          # If there's JSON output, format it nicely
+                         json_data = getattr(content_item, "json", None)
                          if json_data:
                              return f"```json\n{json.dumps(json_data, indent=2)}\n```"
          # Fallback
+         return getattr(response, "output_text", None) or str(response)
      except Exception as e:
          return f"Error extracting output: {e}"
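Reviewer note: the three return paths of `extract_output_text` (plain text, JSON wrapped in a fenced block, and the `output_text` fallback) can be exercised without calling the API. The sketch below uses a condensed copy of the post-commit logic and `SimpleNamespace` stand-ins. These stand-ins are assumptions for illustration and are not real OpenAI SDK response types; the `output_json` item type is taken from the diff as-is.

```python
import json
from types import SimpleNamespace


def extract_output_text(response) -> str:
    """Condensed copy of the post-commit logic, for illustration only."""
    try:
        if getattr(response, "output", None):
            first = response.output[0]
            for item in getattr(first, "content", None) or []:
                item_type = getattr(item, "type", None)
                if item_type == "output_text" and getattr(item, "text", None):
                    return item.text
                if item_type == "output_json" and getattr(item, "json", None):
                    # JSON payloads are pretty-printed inside a fenced block
                    return f"```json\n{json.dumps(item.json, indent=2)}\n```"
        # Fallback: use the convenience attribute, else the raw repr
        return getattr(response, "output_text", None) or str(response)
    except Exception as e:
        return f"Error extracting output: {e}"


# Hypothetical stand-in responses covering each branch
text_reply = SimpleNamespace(
    output=[SimpleNamespace(content=[
        SimpleNamespace(type="output_text", text="## Key Summary\n- done"),
    ])]
)
json_reply = SimpleNamespace(
    output=[SimpleNamespace(content=[
        SimpleNamespace(type="output_json", json={"metric": "accuracy", "value": 0.8732}),
    ])]
)
empty_reply = SimpleNamespace(output=[], output_text="fallback text")

for reply in (text_reply, json_reply, empty_reply):
    print(extract_output_text(reply))
```

Note that the moved `import json` matters here: the JSON branch calls `json.dumps`, so the function depends on that import being present at module level.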