Spaces:
Build error
Build error
| cnt_critic_agents: 3 | |
| max_loop_rounds: 2 | |
| max_criticizing_rounds: 3 | |
| human_eval: false | |
| task_description: |- | |
| build a web application of a simple calculator. | |
| evaluation_dimensions: |- | |
| prompts: | |
| role_assigner_prompt: &role_assigner_prompt |- | |
| You are the leader of a group of experts, now you are faced with a task: | |
| ${task_description} | |
| You can recruit ${cnt_critic_agents} expert team members in different regions. | |
| What experts will you recruit to better generate good ideas? | |
| Output format example: | |
| 1. an electrical engineer specified in the filed of xxx | |
| 2. an economist who is good at xxx | |
| 3. a lawyer with a good knowledge of xxx | |
| ... | |
| ${advice} | |
| You don't have to give the reason. | |
| solver_prompt: &solver_prompt |- | |
| You are faced with the task: | |
| ${task_description} | |
| You formerly gave the following solution | |
| ${former_solution} | |
| But other critics in the group are not satisfied with it. | |
| They gave the following opinions: | |
| ${critic_opinions} | |
| Now you are going to give a new solution, | |
| based upon your former solution and the critics' opinions. | |
| Please write code only! | |
| If no former solution or critic opinions are given, you can ignore this part, and just generate a new solution. | |
| summarizer_prompt: &summarizer_prompt |- | |
| You are a summarizer. | |
| Your task is to categorize and summarize the ideas in the chat history. | |
| Please add the speaker of each idea to the beginning of the content. | |
| The question of the discussing is to ${task_description}. | |
| #Output format | |
| 1. (Speaker1): (Ideas of Speaker 1 in a single line) | |
| 2. (Speaker2): (Ideas of Speaker 2 in a single line) | |
| 3. (Speaker3): (Ideas of Speaker 3 in a single line) | |
| ... | |
| Here is the content you have to summarize: | |
| ${former_solution} | |
| ${critic_opinions} | |
| Please merge all ideas of one speaker into one item. | |
| critic_prompt: &critic_prompt |- | |
| Now you are ${role_description} | |
| You are in a discussion group, aiming to ${task_description}. | |
| Now the group is going to give a preliminary solution as the following: | |
| ${preliminary_solution} | |
| Now the group is asking your opinion about it. Based on your knowledge | |
| in your field, do you agree that this solution can perfectly | |
| solve the problem? | |
| Or do you have any ideas to improve it? | |
| - If you thinks it is perfect, use the following output format: | |
| Action: Agree | |
| Action Input: Agree. | |
| (Do not output your reason for agreeing!) | |
| - If you want to give complemented opinions to improve it or to contradict with it, use the following output format: | |
| Action: Disagree | |
| Action Input: (what you want to say in one line) | |
| P.S. Always remember you are ${role_description}! | |
| ${advice} | |
| If no former solution or critic opinions are given, you can just disagree and output your idea freely, based on the expertise of your role. | |
| Remember, the ideas should be specific and detailed enough, not just general opinions. | |
| Please control output code in 2048 tokens! (Write concise code) | |
| evaluator_prompt: &evaluator_prompt |- | |
| You are an professional code reviewer. | |
| Your task is to evaluate the solution. | |
| The code is to ${task_description}. | |
| Your task is to evaluate the codes written by the code engineers. | |
| Please not only give a general rating points (from 0 to 10) but also give detailed comments about where and how the code can be improved. | |
| Please consider the following aspects when you are evaluating the code: | |
| 1. The code should be able to run without any errors. | |
| 2. The code should be able to achieve the goal specified in the task description. | |
| 3. The code should be easy to read and understand, efficient, concise and elegant. | |
| 4. The code should be robust. | |
| Please rate the code in the following dimensions: | |
| 1. Completeness: Is the code snippet complete enough without unimplemented functions of methods? Is it able to run without any errors? | |
| 2. Functionality: Is the code able to achieve the goal specified in the task description? | |
| 3. Readability: Is the code easy to read and understand, efficient, concise and elegant? | |
| 4. Robustness: Is the code snippet able to handle different unexpected input or other exceptions? | |
| 0 means the idea looks like beginner's work, and 10 means the idea is perfect in that aspect, like a master. | |
| and then in the fifth line of output, give your detailed advice for the engineers to better generate good codes. | |
| #Output format | |
| You must output in the following format: | |
| 1. Completeness: (a score between 0 and 9) | |
| 2. Functionality: (a score between 0 and 9) | |
| 3. Readability: (a score between 0 and 9) | |
| 4. Robustness: (a score between 0 and 9) | |
| 5. Advice: (your advice in one line) | |
| Here is the content you have to evaluate: | |
| ${solution} | |
| name: pipeline | |
| environment: | |
| env_type: task-basic | |
| max_loop_rounds: 3 | |
| rule: | |
| order: | |
| type: sequential | |
| visibility: | |
| type: all | |
| selector: | |
| type: basic | |
| updater: | |
| type: basic | |
| describer: | |
| type: basic | |
| agents: | |
| - #role_assigner_agent: | |
| agent_type: role_assigner | |
| name: role assigner | |
| prompt_template: | |
| memory: | |
| memory_type: chat_history | |
| llm: | |
| llm_type: gpt-4 | |
| model: "gpt-4" | |
| temperature: 0 | |
| max_tokens: 256 | |
| output_parser: | |
| type: role_assigner | |
| - #solver_agent: | |
| agent_type: solver | |
| name: Planner | |
| prompt_template: [*solver_prompt, *summarizer_prompt] | |
| memory: | |
| memory_type: chat_history | |
| llm: | |
| llm_type: gpt-4 | |
| model: "gpt-4" | |
| temperature: 0 | |
| max_tokens: 2048 | |
| - #critic_agents: | |
| agent_type: critic | |
| name: Critic 1 | |
| role_description: |- | |
| Waiting to be assigned. | |
| prompt_template: | |
| memory: | |
| memory_type: chat_history | |
| llm: | |
| llm_type: gpt-4 | |
| model: "gpt-4" | |
| temperature: 0 | |
| max_tokens: 256 | |
| output_parser: | |
| type: critic | |
| - #executor_agent: | |
| agent_type: executor | |
| name: Executor | |
| prompt_template: None | |
| memory: | |
| memory_type: chat_history | |
| llm: | |
| llm_type: gpt-4 | |
| model: "gpt-4" | |
| temperature: 0 | |
| max_tokens: 512 | |
| - #evaluator_agent: | |
| agent_type: evaluator | |
| name: Evaluator | |
| role_description: |- | |
| Code Reviewer | |
| prompt_template: | |
| memory: | |
| memory_type: chat_history | |
| llm: | |
| llm_type: gpt-4 | |
| model: "gpt-4" | |
| temperature: 0 | |
| max_tokens: 128 | |
| output_parser: | |
| type: evaluator | |
| dimensions: | |
| - Completeness | |
| - Functionality | |
| - Readability | |
| - Robustness | |
| tools: | |