import yaml
import os
from smolagents import PromptTemplates, PlanningPromptTemplate, ManagedAgentPromptTemplate, FinalAnswerPromptTemplate
import re
output_prompt = """
Your final answer should be a number OR as few words as possible OR a comma separated list of numbers and/or strings.
If you are asked for a number, do not use commas in the number and do not use units such as $ or a percent sign unless specified otherwise; use digits only.
If you are asked for a string, don't use articles, abbreviations (e.g. for cities), and write the digits in plain text unless specified otherwise.
If you are asked for a comma separated list, apply the above rules depending on whether the element to be put in the list is a number or a string.
If you are returning the final answer, ABSOLUTELY RETURN ONLY A SINGLE STRING CONTAINING THE ANSWER WITH NO REASONING OR DIFFERENT LEVELS OF DETAIL.
DO NOT use any abbreviations, acronyms, or slang in your answers. If you find an abbreviation or acronym in the question, expand it to its full form.
"""
system_prompt = (
    "You are a reasoning agent that must carefully follow answer format instructions "
    "embedded in the question. Always return only the final answer as a plain string, "
    "formatted exactly as instructed (e.g., no commas, round to nearest value, etc). "
    # trailing spaces keep adjacent sentences from running together after concatenation
    "Do not include units, explanations, or extra words unless explicitly required. "
    "Do not include explanations, reasoning steps, or section headers. Just return the answer string only. "
    "Please think step by step and reason carefully to answer the following question:\n\n"
)
commander_prompt = (
"""
You are the top commander agent of a team of genius agents. You will receive a complex question that requires the help of your team to answer.
Your job is to analyze the question, decide which agents are needed, and delegate tasks to them.
You can use the scout agent to analyze images, videos, and audio, and the hacker agent to search the web and calculate mathematical expressions.
If the question requires reasoning or the other agents don't return meaningful results, you can use the thinker agent to reason about the question and provide insights.
You will receive the final answers from your team and must combine them into a single, concise answer
that ABSOLUTELY follows the answer format instructions embedded in the question. Be careful to NOT use any abbreviations, acronyms, or extra words in the final answer.
"""
)
# creating custom commander prompt templates
# getting commander prompts from yaml file
commander_prompts_path = os.path.join(os.path.dirname(__file__), "commander_agent.yaml")
with open(commander_prompts_path, "r", encoding="utf-8") as file:
    commander_prompts_dict = yaml.safe_load(file)
# replacing the {% if custom_instructions %}...{% endif %} block with the commander prompt using regex
commander_prompts_dict["system_prompt"] = re.sub(
    r"{%-?\s*if\s+custom_instructions\s*%}.*?{%-?\s*endif\s*%}",
    # a callable replacement inserts the prompt literally, so backslashes in it are not parsed as escapes
    lambda _match: commander_prompt,
    commander_prompts_dict["system_prompt"],
    flags=re.DOTALL,
)
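# Illustrative sketch only (not part of the original module): the same
# pattern wrapped in a helper so it can be tried on a toy template string.
# `replace_custom_instructions` and `_CUSTOM_BLOCK` are hypothetical names.
import re

_CUSTOM_BLOCK = re.compile(
    r"{%-?\s*if\s+custom_instructions\s*%}.*?{%-?\s*endif\s*%}",
    flags=re.DOTALL,
)


def replace_custom_instructions(template: str, replacement: str) -> str:
    """Swap the Jinja `{% if custom_instructions %}...{% endif %}` block for fixed text."""
    # DOTALL lets `.*?` span newlines; the lazy quantifier stops at the first endif.
    return _CUSTOM_BLOCK.sub(lambda _match: replacement, template)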
# creating the prompt templates object
commander_prompt_templates = PromptTemplates(
    {
        "system_prompt": commander_prompts_dict["system_prompt"],
        "planning": PlanningPromptTemplate(
            commander_prompts_dict["planning"],
        ),
        "managed_agent": ManagedAgentPromptTemplate(
            commander_prompts_dict["managed_agent"],
        ),
        "final_answer": FinalAnswerPromptTemplate(
            commander_prompts_dict["final_answer"],
        ),
    }
)
SCOUT_DESCRIPTION = """
This is a scout agent that can analyze images, videos, and audio to give crucial insights. DO NOT use any abbreviations, acronyms, or slang in your answers.
It has access to the following tools:
- `image_analyzer`: Analyze an image from a URL and answer questions about it.
- `youtube_transcript`: Fetch the transcript of a YouTube video including timestamps.
- `transcriber`: Transcribe audio from given URL, path, or tensor, to text.
"""
HACKER_DESCRIPTION = """
This is a hacker agent that can search the web, calculate mathematical expressions, and provide final answers. DO NOT use any abbreviations, acronyms, or slang in your answers.
It has access to the following tools:
- `wikipedia_search`: Searches Wikipedia and returns a summary or full text of the query results.
- `visit_webpage`: Visit a webpage and extract information as a markdown.
- `multiply`: Multiply two numbers.
- `divide`: Divide two numbers.
- `add`: Add two numbers.
- `subtract`: Subtract two numbers.
- `modulus`: Calculate the modulus of two numbers.
- `final_answer`: Provide the final answer to the question.
"""
THINKER_DESCRIPTION = """
This is a thinker agent that can think at a more logical level about the question and information provided. The thinker reasons about questions and is useful when other code agents have trouble executing the task, or as a reference to compare and check whether code results make sense. Think step by step and reason carefully to answer the question. DO NOT try to run any code, just reason about the question and provide insights. DO NOT use any abbreviations, acronyms, or slang in your answers.
It has access to the following tool:
- `final_answer`: Provide the final answer to the question.
"""
COMMANDER_DESCRIPTION = """
This is a commander agent that coordinates the work of other agents to answer questions. It decides which agents are best suited for the task at hand and delegates tasks accordingly. DO NOT use any abbreviations, acronyms, or slang in your answers.
It manages the following team of agents:
- `hacker`: A hacker that can search the web, calculate, and provide final answers.
- `scout`: A scout that analyzes images, video, and audio to give crucial insights.
- `thinker`: A thinker agent that can reason about the question and provide insights. USE THIS AGENT IF THE OTHER AGENTS DON'T RETURN MEANINGFUL RESULTS.
It has access to the following tools:
- `python_interpreter`: Execute Python code to perform calculations or data processing.
- `final_answer`: Provide the final answer to the question.
"""
# for testing
if __name__ == "__main__":
    print("System Prompt:")
    print(commander_prompts_dict["system_prompt"])
    print("\nPlanning Prompt:")
    print(commander_prompts_dict["planning"])
    print("\nManaged Agent Prompt Template:")
    print(commander_prompts_dict["managed_agent"])
    print("\nFinal Answer Prompt Template:")
    print(commander_prompts_dict["final_answer"])