jkorstad committed
Commit 903bdac · verified · 1 Parent(s): 1ab5d95

Update app.py

Files changed (1)
  1. app.py +29 -22
app.py CHANGED
@@ -108,7 +108,6 @@ tools.append(space_search_tool)
 model = InferenceClientModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct")
 
 # Create the agent
-# Removed python_globals from constructor
 agent = CodeAgent(
     tools=tools,
     model=model,
@@ -117,6 +116,7 @@ agent = CodeAgent(
 )
 
 AGENT_INSTRUCTIONS = """You are a highly capable AI assistant. Your primary goal is to accomplish tasks using a variety of tools, prioritizing Hugging Face Spaces.
+
 Follow these steps:
 1. **Understand the Request:** Carefully analyze the user's prompt. Identify the core task and any specific requirements or inputs.
 2. **Check Predefined Tools:** Review your list of available tools. If a predefined tool can directly address the request, use it.
@@ -128,10 +128,12 @@ Follow these steps:
 5. **Execute the Tool:** Call the tool (predefined, or dynamically created via `Tool.from_space()`) with the necessary arguments.
    * **File Inputs:** If the user uploads files, their paths will be available as global string variables: `input_image_path`, `input_audio_path`, `input_video_path`, `input_3d_model_path`, `input_file_path`. Before using these variables, check if they exist and are not None. Pass these file paths as arguments to tools that require them.
    * **Imports in Generated Code:** If your code block for execution uses modules like `os` or `uuid`, **you must include the import statements (e.g., `import os`, `import uuid`) within that specific code block.**
-6. **Output Management:**
-   * **If a tool returns a filepath string (e.g., to an image, audio, or other file), your final answer for this step should usually be that direct filepath string.** Do NOT attempt to re-save the file using `os.path.join` or `image.save()` unless you are performing an explicit transformation on the file content that requires loading and then saving. The system is designed to handle these returned filepaths.
-   * If a tool returns text, return that text.
+6. **Output Management & Concluding a Step:**
+   * When your code block for a step is complete and has a result (e.g., a text string, a filepath from a tool), use the `return` statement (e.g., `return my_result_variable`).
+   * The system will use this returned value. You might see "ReturnException" in system logs; this is a normal part of a successful `return` and not an error you need to act upon. Based on the returned value, decide on your next action or if the task is complete.
+   * **If the entire user request is satisfied by the value you are returning, that `return` statement concludes your work for the current task.** You do not need to call `final_answer()` yourself; the system handles this based on your `return`.
 7. **Clarity and Error Handling:** If you encounter issues (e.g., a Space tool fails, required inputs are missing), clearly explain the problem in your response. If a Space doesn't work, try to explain why or suggest an alternative if possible.
+
 Example of the **CORRECT AND PREFERRED** way to use a discovered Space:
 ```python
 # User prompt: "Find a space that can make an image of a cat and use it."
@@ -146,20 +148,21 @@ Example of the **CORRECT AND PREFERRED** way to use a discovered Space:
 # # Now use the newly created tool. Arguments depend on the Space's API.
 # # Let's assume it takes a 'prompt'.
 # image_filepath = cat_tool(prompt="A fluffy siamese cat, cyberpunk style")
-# return image_filepath # Return the filepath directly
+# return image_filepath # Return the filepath directly. This is the final result for this task.
 # except Exception as e:
 #     print(f"Failed to create or use tool from Space 'someuser/cat-image-generator' using Tool.from_space(): {e}")
 # # If Tool.from_space() fails, DO NOT immediately try gradio_client.Client().
 # # Instead, consider another space or a predefined tool.
 # # return "Could not use the discovered space via Tool.from_space(). Trying a fallback..." (then try another step)
 ```
+
 Example of using a predefined tool that returns a filepath:
 ```python
 # User prompt: "Generate an image of a happy robot."
 # (Assuming 'image_generator_flux_schnell' is a predefined tool)
 #
 # image_filepath = image_generator_flux_schnell(prompt="A happy robot coding on a laptop, cyberpunk style")
-# return image_filepath # Return the filepath string directly.
+# return image_filepath # Return the filepath string directly. This is the final result for this task.
 ```
 Always ensure your generated Python code is complete and directly callable.
 You have access to `PIL.Image` (as `Image`), `os`, `sys`, `numpy`, `huggingface_hub`, `gradio_client`, `uuid`. Remember to import them if you use them in a code block.
@@ -168,7 +171,8 @@ You have access to `PIL.Image` (as `Image`), `os`, `sys`, `numpy`, `huggingface_
 # Gradio interface function
 def gradio_interface(user_prompt, input_image_path, input_audio_path, input_video_path, input_3d_model_path, input_file_path, progress=gr.Progress(track_tqdm=True)):
     try:
-        progress(0, desc="Initializing Agent...")
+        progress(0, desc="Initializing...") # Step 0
+        print("Progress: 0% - Initializing...")
         full_prompt_with_instructions = f"{AGENT_INSTRUCTIONS}\n\nUSER PROMPT: {user_prompt}"
 
         dynamic_globals_for_run = {}
@@ -210,9 +214,7 @@ def gradio_interface(user_prompt, input_image_path, input_audio_path, input_vide
             print(f"Restored agent.python_interpreter.globals.")
         else:
             print("Warning: Could not restore python_interpreter globals.")
-
-
-        progress(0.8, desc="Processing result...")
+
         outputs = {
             "image": gr.update(value=None, visible=False), "file": gr.update(value=None, visible=False),
             "path": gr.update(value=None, visible=False), "audio": gr.update(value=None, visible=False),
@@ -233,13 +235,15 @@ def gradio_interface(user_prompt, input_image_path, input_audio_path, input_vide
         elif result is None: outputs["text"] = gr.update(value="Agent returned no result (None).", visible=True)
         else: outputs["text"] = gr.update(value=f"Unexpected result type: {type(result)}. Content: {str(result)}", visible=True)
 
-        progress(1, desc="Done!")
+        progress(1, desc="Done!") # Step 3: All processing finished
+        print("Progress: 100% - Done!")
         return (outputs["image"], outputs["file"], outputs["path"], outputs["audio"], outputs["model3d"], outputs["text"])
 
     except Exception as e:
         error_msg = f"An error occurred: {str(e)}"
         print(error_msg)
         traceback.print_exc()
+        progress(1, desc="Error occurred.") # Ensure progress completes on error
         return (None, None, None, None, None, gr.update(value=error_msg, visible=True))
 
 # Create the Gradio app
@@ -251,14 +255,18 @@ with gr.Blocks(theme=gr.themes.Soft()) as app:
     prompt_input = gr.Textbox(label="Enter your prompt", placeholder="e.g., 'Generate an image of a futuristic city'", lines=3, elem_id="user_prompt_textbox")
 
     with gr.Accordion("Optional File Inputs", open=False):
-        with gr.Row():
-            input_image = gr.Image(label="Image Input", type="filepath", sources=["upload", "clipboard"], elem_id="input_image_upload")
-            input_audio = gr.Audio(label="Audio Input", type="filepath", sources=["upload", "microphone"], elem_id="input_audio_upload")
-        with gr.Row():
-            input_video = gr.Video(label="Video Input", sources=["upload"], elem_id="input_video_upload")
-            input_model3d = gr.Model3D(label="3D Model Input", elem_id="input_model3d_upload")
-        with gr.Row():
-            input_file = gr.File(label="Generic File Input", type="filepath", elem_id="input_file_upload")
+        # Using gr.Group for better visual separation of input groups
+        with gr.Group():
+            with gr.Row():
+                input_image = gr.Image(label="Image Input", type="filepath", sources=["upload", "clipboard"], elem_id="input_image_upload")
+                input_audio = gr.Audio(label="Audio Input", type="filepath", sources=["upload", "microphone"], elem_id="input_audio_upload")
+        with gr.Group():
+            with gr.Row():
+                input_video = gr.Video(label="Video Input", sources=["upload"], elem_id="input_video_upload")
+                input_model3d = gr.Model3D(label="3D Model Input", elem_id="input_model3d_upload")
+        with gr.Group():
+            with gr.Row():
+                input_file = gr.File(label="Generic File Input (PDF, TXT, etc.)", type="filepath", elem_id="input_file_upload")
 
     submit_button = gr.Button("🚀 Generate", variant="primary", elem_id="submit_button_generate")
 
@@ -283,11 +291,10 @@ with gr.Blocks(theme=gr.themes.Soft()) as app:
         examples=[
            ["Generate an image of a happy robot coding on a laptop, cyberpunk style.", None, None, None, None, None],
            ["Convert the following text to speech: 'Smolagents are amazing for building AI applications.'", None, None, None, None, None],
-           ["Search for a Hugging Face Space that can perform image captioning. Describe the first result.", None, None, None, None, None],
+           ["Search for a Hugging Face Space that can perform image captioning. Describe the Caption the following image.", "Happy Robot Coding.webp", None, None, None, None],
            ["I have an image of a robot. Make this image Ghibli style.", "Happy Robot Coding.webp", None, None, None, None],
            ["Generate an EDM jazz song about a futuristic city.", None, None, None, None, None],
-           ["Extract text from the uploaded PDF file. (Upload a PDF)", None, None, None, None, None], # User would replace path or upload
-           ["Search for a Hugging Face Space that can translate English to Spanish, then use it to translate: 'Good morning, how are you?'", None, None, None, None, None],
+           ["Generate audio of a dog barking.", None, None, None, None, None],
         ],
         inputs=[prompt_input, input_image, input_audio, input_video, input_model3d, input_file],
         label="Example Prompts (Note: For examples with file inputs, you'll need to upload a relevant file first or ensure the named file exists in the Space's root)"