Spaces: mmmay0722/WebQA-Agent

mmmay0722 committed on Sep 3

Commit: ef3f08c
Parent: 51a1a1d

feat: enhance test case management and update README

- Add automatic case indexing (1: CASE_NAME format) to fix duplicate names
- Add HuggingFace Spaces and community links to README

Files changed (6)

README.md +12 -5
README_zh-CN.md +12 -5
app_gradio/demo_gradio.py +30 -5
app_gradio/gradio_i18n.json +8 -6
docs/images/webqa.svg +0 -0
webqa_agent/testers/function_tester.py +11 -3

README.md CHANGED

@@ -1,7 +1,7 @@
-# WebQA Agent
 <!-- badges -->
-<p align="left">
   <a href="https://github.com/MigoXLab/webqa-agent/blob/main/LICENSE"><img src="https://img.shields.io/github/license/MigoXLab/webqa-agent" alt="License"></a>
   <a href="https://github.com/MigoXLab/webqa-agent/stargazers"><img src="https://img.shields.io/github/stars/MigoXLab/webqa-agent" alt="GitHub stars"></a>
   <a href="https://github.com/MigoXLab/webqa-agent/network/members"><img src="https://img.shields.io/github/forks/MigoXLab/webqa-agent" alt="GitHub forks"></a>
@@ -9,9 +9,14 @@
   <a href="https://deepwiki.com/MigoXLab/webqa-agent"><img src="https://deepwiki.com/badge.svg" alt="Ask DeepWiki"></a>
 </p>
-[English](README.md) · [简体中文](README_zh-CN.md)
-**WebQA Agent** is an autonomous web agent that audits performance, functionality, and UX for any web product.
 ## 🚀 Core Features
@@ -101,7 +106,9 @@ python webqa-agent.py
 ## Online Demo
-Experience online: [WebQA-Agent on ModelScope](https://modelscope.cn/studios/mmmmei22/WebQA-Agent/summary)
 ## Usage

+<h1 align="center">WebQA Agent</h1>
 <!-- badges -->
+<p align="center">
   <a href="https://github.com/MigoXLab/webqa-agent/blob/main/LICENSE"><img src="https://img.shields.io/github/license/MigoXLab/webqa-agent" alt="License"></a>
   <a href="https://github.com/MigoXLab/webqa-agent/stargazers"><img src="https://img.shields.io/github/stars/MigoXLab/webqa-agent" alt="GitHub stars"></a>
   <a href="https://github.com/MigoXLab/webqa-agent/network/members"><img src="https://img.shields.io/github/forks/MigoXLab/webqa-agent" alt="GitHub forks"></a>
   <a href="https://deepwiki.com/MigoXLab/webqa-agent"><img src="https://deepwiki.com/badge.svg" alt="Ask DeepWiki"></a>
 </p>
+<p align="center">
+  Try Demo 🤗<a href="https://huggingface.co/spaces/mmmay0722/WebQA-Agent">HuggingFace</a> | 🚀<a href="https://modelscope.cn/studios/mmmmei22/WebQA-Agent/summary">ModelScope</a><br>
+  Join us on 🎮<a href="https://discord.gg/K5TtkVcx">Discord</a> | 💬<a href="https://aicarrier.feishu.cn/docx/NRNXdIirXoSQEHxhaqjchUfenzd">WeChat</a>
+</p>
+<p align="center"><a href="README.md">English</a> · <a href="README_zh-CN.md">简体中文</a></p>
+<p align="center">🤖 <strong>WebQA Agent</strong> is an autonomous web browser agent that audits performance, functionality & UX for engineers and vibe-coding creators. ✨</p>
 ## 🚀 Core Features
 ## Online Demo
+🚀 **Try WebQA Agent Online:**
+- **Hugging Face Spaces**: [WebQA-Agent on Hugging Face](https://huggingface.co/spaces/mmmay0722/WebQA-Agent)
+- **ModelScope Studio**: [WebQA-Agent on ModelScope](https://modelscope.cn/studios/mmmmei22/WebQA-Agent/summary)
 ## Usage

README_zh-CN.md CHANGED

@@ -1,7 +1,7 @@
-# WebQA Agent
 <!-- badges -->
-<p align="left">
   <a href="https://github.com/MigoXLab/webqa-agent/blob/main/LICENSE"><img src="https://img.shields.io/github/license/MigoXLab/webqa-agent" alt="License"></a>
   <a href="https://github.com/MigoXLab/webqa-agent/stargazers"><img src="https://img.shields.io/github/stars/MigoXLab/webqa-agent" alt="GitHub stars"></a>
   <a href="https://github.com/MigoXLab/webqa-agent/network/members"><img src="https://img.shields.io/github/forks/MigoXLab/webqa-agent" alt="GitHub forks"></a>
@@ -9,9 +9,14 @@
   <a href="https://deepwiki.com/MigoXLab/webqa-agent"><img src="https://deepwiki.com/badge.svg" alt="Ask DeepWiki"></a>
 </p>
-[English](README.md) · [简体中文](README_zh-CN.md)
-**WebQA Agent** 是全自动网页评估测试 Agent，一键诊断性能、安全、功能与交互体验
 ## 🚀 核心特性
@@ -104,7 +109,9 @@ python webqa-agent.py
 ## 在线演示
-进入ModelScope体验：[WebQA-Agent on ModelScope](https://modelscope.cn/studios/mmmmei22/WebQA-Agent/summary)
 ## 使用说明

+<h1 align="center">WebQA Agent</h1>
 <!-- badges -->
+<p align="center">
   <a href="https://github.com/MigoXLab/webqa-agent/blob/main/LICENSE"><img src="https://img.shields.io/github/license/MigoXLab/webqa-agent" alt="License"></a>
   <a href="https://github.com/MigoXLab/webqa-agent/stargazers"><img src="https://img.shields.io/github/stars/MigoXLab/webqa-agent" alt="GitHub stars"></a>
   <a href="https://github.com/MigoXLab/webqa-agent/network/members"><img src="https://img.shields.io/github/forks/MigoXLab/webqa-agent" alt="GitHub forks"></a>
   <a href="https://deepwiki.com/MigoXLab/webqa-agent"><img src="https://deepwiki.com/badge.svg" alt="Ask DeepWiki"></a>
 </p>
+<p align="center">
+  体验Demo 🤗<a href="https://huggingface.co/spaces/mmmay0722/WebQA-Agent">HuggingFace</a> | 🚀<a href="https://modelscope.cn/studios/mmmmei22/WebQA-Agent/summary">ModelScope</a><br>
+  加入我们 🎮<a href="https://discord.gg/K5TtkVcx">Discord</a> | 💬<a href="https://aicarrier.feishu.cn/docx/NRNXdIirXoSQEHxhaqjchUfenzd">微信群</a>
+</p>
+<p align="center"><a href="README.md">English</a> · <a href="README_zh-CN.md">简体中文</a></p>
+<p align="center">🤖 <strong>WebQA Agent</strong> 是全自动网页评估测试 Agent，一键完成性能、功能与交互体验的测试评估 ✨</p>
 ## 🚀 核心特性
 ## 在线演示
+🚀 **在线体验 WebQA Agent:**
+- **Hugging Face Spaces**: [WebQA-Agent on Hugging Face](https://huggingface.co/spaces/mmmay0722/WebQA-Agent)
+- **ModelScope Studio**: [WebQA-Agent on ModelScope](https://modelscope.cn/studios/mmmmei22/WebQA-Agent/summary)
 ## 使用说明

app_gradio/demo_gradio.py CHANGED

@@ -157,6 +157,16 @@ def create_config_dict(
     report_language: str = "zh-CN"
 ) -> Dict[str, Any]:
     """Create configuration dictionary"""
     config = {
         "target": {
             "url": url,
@@ -166,7 +176,7 @@ def create_config_dict(
             "function_test": {
                 "enabled": function_test_enabled,
                 "type": function_test_type,
-                "business_objectives": business_objectives
             },
             "ux_test": {
                 "enabled": ux_test_enabled
@@ -321,10 +331,6 @@ def submit_test(
     if not any([function_test_enabled, ux_test_enabled, performance_test_enabled, security_test_enabled]):
         return get_text(interface_language, "messages.error_no_tests"), "", False
-    # If function test is enabled but no business objectives set
-    if function_test_enabled and function_test_type == "ai" and not business_objectives.strip():
-        return get_text(interface_language, "messages.error_no_business_objectives"), "", False
     # Validate LLM configuration
     valid, msg = validate_llm_config(api_key, base_url, model, interface_language)
     if not valid:
@@ -357,6 +363,7 @@ def submit_test(
         "tests": {
             "function": function_test_enabled,
             "function_type": function_test_type,
             "ux": ux_test_enabled,
         },
         "submitted_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
@@ -630,6 +637,15 @@ def create_gradio_interface(language: str = "zh-CN"):
     .fixed-width-table td:nth-child(6),
     .content-wrapper .gradio-dataframe th:nth-child(6),
     .content-wrapper .gradio-dataframe td:nth-child(6) {
         width: auto !important;
         max-width: none !important;
         min-width: 70px !important;
@@ -866,12 +882,21 @@ def create_gradio_interface(language: str = "zh-CN"):
         def get_history_rows(lang):
             rows = []
             for item in reversed(submission_history[-100:]):
                 rows.append([
                     item["submitted_at"],
                     item["task_id"],
                     item["url"],
                     "✅" if item["tests"]["function"] else "-",
                     item["tests"]["function_type"],
                     "✅" if item["tests"]["ux"] else "-"
                 ])
             return rows

     report_language: str = "zh-CN"
 ) -> Dict[str, Any]:
     """Create configuration dictionary"""
+    final_business_objectives = business_objectives.strip()
+    default_constraint = get_text(report_language, "config.default_business_objectives")
+    if final_business_objectives:
+        separator = ","
+        final_business_objectives = f"{final_business_objectives}{separator}{default_constraint}"
+    else:
+        final_business_objectives = default_constraint
     config = {
         "target": {
             "url": url,
             "function_test": {
                 "enabled": function_test_enabled,
                 "type": function_test_type,
+                "business_objectives": final_business_objectives
             },
             "ux_test": {
                 "enabled": ux_test_enabled
     if not any([function_test_enabled, ux_test_enabled, performance_test_enabled, security_test_enabled]):
         return get_text(interface_language, "messages.error_no_tests"), "", False
     # Validate LLM configuration
     valid, msg = validate_llm_config(api_key, base_url, model, interface_language)
     if not valid:
         "tests": {
             "function": function_test_enabled,
             "function_type": function_test_type,
+            "business_objectives": business_objectives,
             "ux": ux_test_enabled,
         },
         "submitted_at": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
     .fixed-width-table td:nth-child(6),
     .content-wrapper .gradio-dataframe th:nth-child(6),
     .content-wrapper .gradio-dataframe td:nth-child(6) {
+        width: auto !important;
+        max-width: none !important;
+        min-width: 200px !important;
+        text-align: left !important;
+    }
+    .fixed-width-table th:nth-child(7),
+    .fixed-width-table td:nth-child(7),
+    .content-wrapper .gradio-dataframe th:nth-child(7),
+    .content-wrapper .gradio-dataframe td:nth-child(7) {
         width: auto !important;
         max-width: none !important;
         min-width: 70px !important;
         def get_history_rows(lang):
             rows = []
             for item in reversed(submission_history[-100:]):
+                business_objectives = item["tests"].get("business_objectives", "")
+                function_type = item["tests"]["function_type"]
+                if function_type == "ai" and business_objectives:
+                    business_display = business_objectives[:30] + "..." if len(business_objectives) > 30 else business_objectives
+                else:
+                    business_display = "-"
                 rows.append([
                     item["submitted_at"],
                     item["task_id"],
                     item["url"],
                     "✅" if item["tests"]["function"] else "-",
                     item["tests"]["function_type"],
+                    business_display,
                     "✅" if item["tests"]["ux"] else "-"
                 ])
             return rows
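
The two behavioural changes in this file (always appending a default constraint to the business objectives, and showing a shortened objective in the history table) can be sketched in isolation. This is a minimal sketch, not the app's code: `DEFAULT_OBJECTIVES`, `merge_business_objectives`, `shorten_for_history`, and the `"en-US"` language key are hypothetical names introduced here; the real code obtains the default text through `get_text(report_language, "config.default_business_objectives")`.

```python
from typing import Dict

# Hypothetical stand-in for the i18n lookup done by get_text(...) in the app.
# The "en-US" key is an assumption; the diff only shows "zh-CN" as a default.
DEFAULT_OBJECTIVES: Dict[str, str] = {
    "en-US": "Generate 1 test case with no more than 6 steps per case",
}


def merge_business_objectives(user_text: str, language: str = "en-US") -> str:
    """Append the default constraint to the user's text, or use it alone when blank."""
    default_constraint = DEFAULT_OBJECTIVES[language]
    cleaned = user_text.strip()
    if cleaned:
        # Same "," separator as in the diff.
        return f"{cleaned},{default_constraint}"
    return default_constraint


def shorten_for_history(text: str, function_type: str, limit: int = 30) -> str:
    """Build the history-table cell: '-' unless this is an AI test with objectives."""
    if function_type != "ai" or not text:
        return "-"
    # The conditional expression binds looser than "+", so the ellipsis is
    # only appended on the truncated branch.
    return text[:limit] + "..." if len(text) > limit else text


if __name__ == "__main__":
    print(merge_business_objectives("Test chat functionality"))
    print(shorten_for_history("x" * 40, "ai"))
```

Note that the history entry stores the raw user input rather than the merged string, so a submission with a blank objective shows `-` in the table even though the default constraint is still sent to the tester.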

app_gradio/gradio_i18n.json CHANGED

@@ -29,8 +29,9 @@
       "function_test_type": "功能测试类型",
       "function_test_type_info": "default: 遍历测试，覆盖可点击元素和所有链接\n ai: 基于视觉模型的智能测试，能够模拟真实用户行为、理解业务上下文，验证网页功能。",
       "business_objectives": "AI功能测试业务目标",
-      "business_objectives_placeholder": "测试对话功能，生成2个用例",
-      "business_objectives_info": "ai: 定制不同场景，精准发现复杂功能问题",
       "ux_test": "用户体验测试",
       "performance_test": "性能测试",
       "performance_test_info": "目前在 ModelScope 版本不可用；请前往 GitHub 体验",
@@ -54,7 +55,7 @@
     },
     "history": {
       "title": "提交记录",
-      "headers": ["提交时间", "任务ID", "URL", "功能测试", "类型", "UX测试"],
       "refresh_btn": "🔄 刷新历史记录"
     },
     "messages": {
@@ -118,8 +119,9 @@
       "function_test_type": "Function Test Type",
       "function_test_type_info": "default: Traverse clickable elements & links.\n ai: Vision-model intelligent test simulating users & validating functionality.",
       "business_objectives": "AI Function Test Business Objectives",
-      "business_objectives_placeholder": "Test chat functionality, generate 2 test cases",
-      "business_objectives_info": "ai: Customize different scenarios, accurately find complex functional issues",
       "ux_test": "User Experience Test",
       "performance_test": "Performance Test",
       "performance_test_info": "Currently unavailable in HuggingFace version; please visit GitHub for experience",
@@ -143,7 +145,7 @@
     },
     "history": {
       "title": "Submission Records",
-      "headers": ["Submit Time", "Task ID", "URL", "Function Test", "Type", "UX Test"],
       "refresh_btn": "🔄 Refresh History"
     },
     "messages": {

       "function_test_type": "功能测试类型",
       "function_test_type_info": "default: 遍历测试，覆盖可点击元素和所有链接\n ai: 基于视觉模型的智能测试，能够模拟真实用户行为、理解业务上下文，验证网页功能。",
       "business_objectives": "AI功能测试业务目标",
+      "business_objectives_placeholder": "测试对话功能",
+      "business_objectives_info": "ai: 定制不同场景，精准发现复杂功能问题。留空将使用默认设置（生成1个测试用例，每个用例包含6个步骤以内）",
+      "default_business_objectives": "生成1个测试用例，每个用例包含6个步骤以内",
       "ux_test": "用户体验测试",
       "performance_test": "性能测试",
       "performance_test_info": "目前在 ModelScope 版本不可用；请前往 GitHub 体验",
     },
     "history": {
       "title": "提交记录",
+      "headers": ["提交时间", "任务ID", "URL", "功能测试", "类型", "业务目标", "UX测试"],
       "refresh_btn": "🔄 刷新历史记录"
     },
     "messages": {
       "function_test_type": "Function Test Type",
       "function_test_type_info": "default: Traverse clickable elements & links.\n ai: Vision-model intelligent test simulating users & validating functionality.",
       "business_objectives": "AI Function Test Business Objectives",
+      "business_objectives_placeholder": "Test chat functionality",
+      "business_objectives_info": "ai: Customize different scenarios, accurately find complex functional issues. Leave blank to use default settings (generate 1 test case with no more than 6 steps per case)",
+      "default_business_objectives": "Generate 1 test case with no more than 6 steps per case",
       "ux_test": "User Experience Test",
       "performance_test": "Performance Test",
       "performance_test_info": "Currently unavailable in HuggingFace version; please visit GitHub for experience",
     },
     "history": {
       "title": "Submission Records",
+      "headers": ["Submit Time", "Task ID", "URL", "Function Test", "Type", "Business Objectives", "UX Test"],
       "refresh_btn": "🔄 Refresh History"
     },
     "messages": {

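The new `default_business_objectives` string is read with a dotted key (`"config.default_business_objectives"`), and the `headers` list grows to seven entries to match the extra history column. The sketch below shows one way such a dotted-key lookup could work. It is an assumption, not the project's `get_text` implementation: `TRANSLATIONS`, `lookup`, the `"en-US"` language key, and the `"config"` section name (inferred from the key used in `demo_gradio.py`) are all hypothetical.

```python
from typing import Any, Dict

# Trimmed, hypothetical translation table. Only the strings are taken from the
# diff; the nesting under "en-US" and "config" is an assumption.
TRANSLATIONS: Dict[str, Dict[str, Any]] = {
    "en-US": {
        "config": {
            "default_business_objectives": "Generate 1 test case with no more than 6 steps per case",
        },
        "history": {
            "headers": [
                "Submit Time", "Task ID", "URL", "Function Test",
                "Type", "Business Objectives", "UX Test",
            ],
        },
    },
}


def lookup(language: str, dotted_key: str) -> Any:
    """Walk the nested dict one key segment at a time; raises KeyError if missing."""
    node: Any = TRANSLATIONS[language]
    for part in dotted_key.split("."):
        node = node[part]
    return node


if __name__ == "__main__":
    headers = lookup("en-US", "history.headers")
    # "Business Objectives" is the 6th column, which is the column the widened
    # nth-child(6) CSS rule in demo_gradio.py targets.
    print(headers.index("Business Objectives") + 1)
```
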
docs/images/webqa.svg CHANGED

webqa_agent/testers/function_tester.py CHANGED

@@ -335,6 +335,7 @@ class UITester:
                     raise ValueError(f"Invalid JSON response: {str(je)}")
                 if not plan_json.get("actions"):
                     raise ValueError("No valid actions found in plan")
                 return plan_json
@@ -477,8 +478,14 @@ class UITester:
             )
             self.finish_case("interrupted", "Case was interrupted by new case start")
         self.current_case_data = {
-            "name": case_name,
             "start_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
             "case_info": case_data or {},
             "steps": [],
@@ -494,7 +501,7 @@ class UITester:
         }
         self.current_case_steps = []
         self.step_counter = 0  # Reset step counter
-        logging.debug(f"Started tracking case: {case_name} (step counter reset)")
     def add_step_data(self, step_data: Dict[str, Any], step_type: str = "action"):
         """Add step data to current case."""
@@ -547,6 +554,7 @@ class UITester:
             return
         case_name = self.current_case_data.get("name", "Unknown")
         steps_count = len(self.current_case_steps)
         # Get monitoring data
@@ -624,7 +632,7 @@ class UITester:
         total_steps = 0
         for i, case in enumerate(self.all_cases_data):
             case_steps = case.get("steps", [])
-            case_name = case.get("name", f"Case_{i}")
             total_steps += len(case_steps)
             logging.debug(
                 f"Report validation - Case '{case_name}': {len(case_steps)} steps, status: {case.get('status', 'unknown')}"

                     raise ValueError(f"Invalid JSON response: {str(je)}")
                 if not plan_json.get("actions"):
+                    logging.error(f"No valid actions found in plan: {test_plan}")
                     raise ValueError("No valid actions found in plan")
                 return plan_json
             )
             self.finish_case("interrupted", "Case was interrupted by new case start")
+        # Calculate case index (1-based)
+        case_index = len(self.all_cases_data) + 1
+        formatted_case_name = f"{case_index}: {case_name}"
         self.current_case_data = {
+            "name": formatted_case_name,
+            "original_name": case_name,  # Keep original name for reference
+            "case_index": case_index,
             "start_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
             "case_info": case_data or {},
             "steps": [],
         }
         self.current_case_steps = []
         self.step_counter = 0  # Reset step counter
+        logging.debug(f"Started tracking case: {formatted_case_name} (step counter reset)")
     def add_step_data(self, step_data: Dict[str, Any], step_type: str = "action"):
         """Add step data to current case."""
             return
         case_name = self.current_case_data.get("name", "Unknown")
+        original_name = self.current_case_data.get("original_name", case_name)
         steps_count = len(self.current_case_steps)
         # Get monitoring data
         total_steps = 0
         for i, case in enumerate(self.all_cases_data):
             case_steps = case.get("steps", [])
+            case_name = case.get("name", f"Case_{i + 1}")  # Use 1-based indexing as fallback
             total_steps += len(case_steps)
             logging.debug(
                 f"Report validation - Case '{case_name}': {len(case_steps)} steps, status: {case.get('status', 'unknown')}"
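
The case-indexing change above can be illustrated with a small stand-alone sketch. It assumes cases are numbered by how many are already recorded, as in the diff (`len(self.all_cases_data) + 1`). `CaseTracker` and `start_case` are hypothetical names for a trimmed-down stand-in, not the real `UITester` API, and appending the case immediately is a simplification of whatever the real class does when a case finishes.

```python
from typing import Any, Dict, List


class CaseTracker:
    """Hypothetical, trimmed-down stand-in for the case bookkeeping in UITester."""

    def __init__(self) -> None:
        self.all_cases_data: List[Dict[str, Any]] = []

    def start_case(self, case_name: str) -> Dict[str, Any]:
        # 1-based index derived from the number of cases recorded so far.
        case_index = len(self.all_cases_data) + 1
        case = {
            "name": f"{case_index}: {case_name}",  # unique display name
            "original_name": case_name,            # kept for reference
            "case_index": case_index,
        }
        # Simplification: record the case right away so the next index advances.
        self.all_cases_data.append(case)
        return case


if __name__ == "__main__":
    tracker = CaseTracker()
    first = tracker.start_case("Login")
    second = tracker.start_case("Login")
    print(first["name"], "|", second["name"])
```

Two cases that share a name now get distinct display names while `original_name` preserves the text the planner produced, which is the duplicate-name fix described in the commit message.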