brieuc.crosson
committed on
Commit
·
56ffdad
1
Parent(s):
913c2a3
feat: working on first agent
- MANIFEST.in +2 -0
- README.md +32 -3
- agent_piment_bleu/agent.py +539 -0
- agent_piment_bleu/logger.py +52 -7
- agent_piment_bleu/orchestrator.py +191 -137
- agent_piment_bleu/ui.py +188 -97
- dev_context/ROADMAP.md +27 -21
- examples/js_vuln/README.md +49 -0
- examples/js_vuln/app.js +123 -0
- examples/js_vuln/package.json +32 -0
- examples/js_vuln/utils.js +89 -0
- examples/js_vuln/views/index.handlebars +86 -0
- setup.py +4 -1
MANIFEST.in
ADDED
|
@@ -0,0 +1,2 @@
|
| 1 |
+
# Include the examples directory for testing
|
| 2 |
+
recursive-include examples *
|
README.md
CHANGED
|
@@ -10,6 +10,12 @@ AgentPimentBleu is an AI-powered agent designed to intelligently scan Git reposi
|
|
| 10 |
|
| 11 |
1. Detecting coding mistakes and configuration errors with AI-enhanced context.
|
| 12 |
2. Identifying vulnerable dependencies and, crucially, **assessing their actual impact** within the specific project's context, filtering out noise from irrelevant CVEs.
|
| 13 |
|
| 14 |
The goal is to provide developers with actionable, prioritized security insights, enabling them to focus on what truly matters.
|
| 15 |
|
|
@@ -20,9 +26,10 @@ This is the initial implementation of AgentPimentBleu, focusing on Phase 1 of th
|
|
| 20 |
- [x] Basic Gradio UI with repository URL input
|
| 21 |
- [x] Core functionality to clone and analyze Git repositories
|
| 22 |
- [x] LLM Integration with Ollama and Modal
|
| 23 |
-
- [
|
| 24 |
-
- [
|
| 25 |
-
- [
|
| 26 |
|
| 27 |
## Installation
|
| 28 |
|
|
@@ -61,6 +68,21 @@ This is the initial implementation of AgentPimentBleu, focusing on Phase 1 of th
|
|
| 61 |
|
| 62 |
4. The application will clone the repository and display the scan results.
|
| 63 |
|
| 64 |
### LLM Configuration
|
| 65 |
|
| 66 |
AgentPimentBleu uses a configuration file at `~/.config/agent_piment_bleu/llm_config.json` to store LLM provider settings. The default configuration will be created automatically on first run, but you can modify it to change the default provider or provider-specific settings:
|
|
@@ -125,6 +147,7 @@ This will create a result directory with the built package.
|
|
| 125 |
- `main.py`: Entry point for the application, re-exports main functions
|
| 126 |
- `ui.py`: Gradio UI implementation
|
| 127 |
- `orchestrator.py`: Main orchestrator that coordinates the scanning process
|
| 128 |
- `project_detector.py`: Detects programming languages used in the repository
|
| 129 |
- `reporting.py`: Generates formatted reports from scan results
|
| 130 |
- `llm/`: LLM integration modules
|
|
@@ -142,6 +165,12 @@ This will create a result directory with the built package.
|
|
| 142 |
- `sca.py`: Python SCA scanner using pip-audit
|
| 143 |
- `utils/`: Utility functions
|
| 144 |
- `git_utils.py`: Git repository handling functions
|
| 145 |
|
| 146 |
## Future Development
|
| 147 |
|
|
|
|
| 10 |
|
| 11 |
1. Detecting coding mistakes and configuration errors with AI-enhanced context.
|
| 12 |
2. Identifying vulnerable dependencies and, crucially, **assessing their actual impact** within the specific project's context, filtering out noise from irrelevant CVEs.
|
| 13 |
+
3. Exploring the codebase to understand how vulnerabilities might affect the specific project.
|
| 14 |
+
|
| 15 |
+
The agent follows a three-step process for each vulnerability:
|
| 16 |
+
1. Analyze the vulnerability details (CVE information)
|
| 17 |
+
2. Search for potential consequences in the codebase by exploring relevant files
|
| 18 |
+
3. Generate a comprehensive report with project-specific severity assessment
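
The three-step process above can be sketched as a small loop that shares one conversation history across all steps. This is only an illustration under stated assumptions: `run_three_step_analysis`, `ask`, `explore`, and `fake_llm` are hypothetical names, the prompts are shortened stand-ins, and `CVE-0000-0000` is a placeholder identifier rather than a real advisory.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def run_three_step_analysis(
    ask: Callable[[str, List[Message]], str],
    cve_text: str,
    explore: Callable[[str], str],
) -> Dict[str, str]:
    """Run the three steps while sharing one conversation history.

    `ask` and `explore` are hypothetical callables, not the project's API.
    """
    history: List[Message] = []

    def turn(prompt: str) -> str:
        # Send the prompt together with all earlier turns, then record both sides.
        reply = ask(prompt, list(history))
        history.append({"role": "user", "content": prompt})
        history.append({"role": "assistant", "content": reply})
        return reply

    # Step 1: analyze the vulnerability details.
    cve_analysis = turn(f"Analyze this vulnerability:\n{cve_text}")
    # Step 2: explore the codebase, guided by the first answer.
    findings = explore(cve_analysis)
    exploration = turn(f"Exploration results:\n{findings}")
    # Step 3: ask for the project-specific report.
    report = turn("Provide a final, project-specific severity assessment.")
    return {"cve_analysis": cve_analysis, "exploration": exploration, "report": report}


def fake_llm(prompt: str, history: List[Message]) -> str:
    # Stand-in model: reports how many earlier messages it was given.
    return f"reply after {len(history)} messages"


result = run_three_step_analysis(
    fake_llm, "CVE-0000-0000 in example-pkg", lambda suggestions: "no matches"
)
```

With the stand-in model, each step sees two more messages than the previous one, which makes it easy to check that the history is actually being carried forward.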
|
| 19 |
|
| 20 |
The goal is to provide developers with actionable, prioritized security insights, enabling them to focus on what truly matters.
|
| 21 |
|
|
|
|
| 26 |
- [x] Basic Gradio UI with repository URL input
|
| 27 |
- [x] Core functionality to clone and analyze Git repositories
|
| 28 |
- [x] LLM Integration with Ollama and Modal
|
| 29 |
+
- [x] SAST Integration with AI-enhanced analysis
|
| 30 |
+
- [x] SCA Integration with npm audit and pip-audit
|
| 31 |
+
- [x] AI-Powered Dependency Impact Assessment with codebase exploration
|
| 32 |
+
- [x] Intelligent agent that explores the codebase to assess vulnerability impact
|
| 33 |
|
| 34 |
## Installation
|
| 35 |
|
|
|
|
| 68 |
|
| 69 |
4. The application will clone the repository and display the scan results.
|
| 70 |
|
| 71 |
+
### Testing with Dummy Vulnerable Project
|
| 72 |
+
|
| 73 |
+
For testing purposes, AgentPimentBleu includes a dummy vulnerable JavaScript project:
|
| 74 |
+
|
| 75 |
+
1. Go to the "LLM Testing" tab in the UI.
|
| 76 |
+
2. Click the "Use Dummy Project" button at the bottom of the left column.
|
| 77 |
+
3. This will set the repository URL in the "Repository Scanner" tab to a special test URL.
|
| 78 |
+
4. Go back to the "Repository Scanner" tab and click "Scan Repository".
|
| 79 |
+
5. The application will use the local dummy project instead of cloning a repository.
|
| 80 |
+
|
| 81 |
+
This dummy project contains intentional vulnerabilities for testing the agent's analysis capabilities:
|
| 82 |
+
- Vulnerable dependencies in package.json
|
| 83 |
+
- Code with security issues (XSS, SSRF, command injection, etc.)
|
| 84 |
+
- Realistic project structure to test the agent's exploration capabilities
|
| 85 |
+
|
| 86 |
### LLM Configuration
|
| 87 |
|
| 88 |
AgentPimentBleu uses a configuration file at `~/.config/agent_piment_bleu/llm_config.json` to store LLM provider settings. The default configuration will be created automatically on first run, but you can modify it to change the default provider or provider-specific settings:
|
|
|
|
| 147 |
- `main.py`: Entry point for the application, re-exports main functions
|
| 148 |
- `ui.py`: Gradio UI implementation
|
| 149 |
- `orchestrator.py`: Main orchestrator that coordinates the scanning process
|
| 150 |
+
- `agent.py`: Intelligent agent for exploring codebases and analyzing vulnerabilities
|
| 151 |
- `project_detector.py`: Detects programming languages used in the repository
|
| 152 |
- `reporting.py`: Generates formatted reports from scan results
|
| 153 |
- `llm/`: LLM integration modules
|
|
|
|
| 165 |
- `sca.py`: Python SCA scanner using pip-audit
|
| 166 |
- `utils/`: Utility functions
|
| 167 |
- `git_utils.py`: Git repository handling functions
|
| 168 |
+
- `examples/`: Example projects for testing
|
| 169 |
+
- `js_vuln/`: Dummy vulnerable JavaScript project
|
| 170 |
+
- `app.js`: Main application file with intentional vulnerabilities
|
| 171 |
+
- `utils.js`: Utility functions with some vulnerable patterns
|
| 172 |
+
- `package.json`: Dependencies with known vulnerabilities
|
| 173 |
+
- `views/`: Directory containing view templates
|
| 174 |
|
| 175 |
## Future Development
|
| 176 |
|
agent_piment_bleu/agent.py
ADDED
|
@@ -0,0 +1,539 @@
|
| 1 |
+
"""
|
| 2 |
+
Agent for exploring codebases and analyzing vulnerabilities
|
| 3 |
+
|
| 4 |
+
This module implements an agent that can explore a codebase to find where CVEs could be an issue.
|
| 5 |
+
The agent uses an LLM to analyze CVEs and explore the codebase to find potential vulnerabilities.
|
| 6 |
+
"""
|
| 7 |
+
|
| 8 |
+
import os
|
| 9 |
+
import subprocess
|
| 10 |
+
from typing import Dict, List, Any, Optional, Tuple
|
| 11 |
+
|
| 12 |
+
from agent_piment_bleu.llm.base import LLMProvider
|
| 13 |
+
from agent_piment_bleu.logger import get_logger
|
| 14 |
+
|
| 15 |
+
|
| 16 |
+
class SecurityAgent:
|
| 17 |
+
"""
|
| 18 |
+
Agent for exploring codebases and analyzing vulnerabilities.
|
| 19 |
+
|
| 20 |
+
This agent uses an LLM to analyze CVEs and explore the codebase to find potential vulnerabilities.
|
| 21 |
+
It follows a three-step process:
|
| 22 |
+
1. Analyze the CVE
|
| 23 |
+
2. Search for potential consequences in the codebase (by opening different files)
|
| 24 |
+
3. Make a final report
|
| 25 |
+
"""
|
| 26 |
+
|
| 27 |
+
def __init__(self, llm: LLMProvider, repo_path: str):
|
| 28 |
+
"""
|
| 29 |
+
Initialize the security agent.
|
| 30 |
+
|
| 31 |
+
Args:
|
| 32 |
+
llm (LLMProvider): LLM provider to use for analysis
|
| 33 |
+
repo_path (str): Path to the repository to analyze
|
| 34 |
+
"""
|
| 35 |
+
self.llm = llm
|
| 36 |
+
self.repo_path = repo_path
|
| 37 |
+
self.logger = get_logger()
|
| 38 |
+
self.conversation_history = []
|
| 39 |
+
|
| 40 |
+
def get_project_structure(self) -> str:
|
| 41 |
+
"""
|
| 42 |
+
Get the structure of the project as a string (similar to tree command output).
|
| 43 |
+
|
| 44 |
+
Returns:
|
| 45 |
+
str: Project structure as a string
|
| 46 |
+
"""
|
| 47 |
+
try:
|
| 48 |
+
# Check if tree command is available
|
| 49 |
+
result = subprocess.run(
|
| 50 |
+
["which", "tree"],
|
| 51 |
+
capture_output=True,
|
| 52 |
+
text=True
|
| 53 |
+
)
|
| 54 |
+
|
| 55 |
+
if result.returncode == 0:
|
| 56 |
+
# Use tree command if available
|
| 57 |
+
tree_result = subprocess.run(
|
| 58 |
+
["tree", "-L", "3", self.repo_path],
|
| 59 |
+
capture_output=True,
|
| 60 |
+
text=True
|
| 61 |
+
)
|
| 62 |
+
return tree_result.stdout
|
| 63 |
+
else:
|
| 64 |
+
# Fallback to a simple directory listing
|
| 65 |
+
structure = []
|
| 66 |
+
|
| 67 |
+
for root, dirs, files in os.walk(self.repo_path):
|
| 68 |
+
# Limit depth to 3 levels
|
| 69 |
+
level = root.replace(self.repo_path, '').count(os.sep)
|
| 70 |
+
if level > 3:
|
| 71 |
+
continue
|
| 72 |
+
|
| 73 |
+
indent = ' ' * 4 * level
|
| 74 |
+
structure.append(f"{indent}{os.path.basename(root)}/")
|
| 75 |
+
|
| 76 |
+
sub_indent = ' ' * 4 * (level + 1)
|
| 77 |
+
for file in files:
|
| 78 |
+
structure.append(f"{sub_indent}{file}")
|
| 79 |
+
|
| 80 |
+
return '\n'.join(structure)
|
| 81 |
+
except Exception as e:
|
| 82 |
+
self.logger.error(f"Error getting project structure: {e}")
|
| 83 |
+
return f"Error getting project structure: {e}"
|
| 84 |
+
|
| 85 |
+
def read_file(self, file_path: str) -> str:
|
| 86 |
+
"""
|
| 87 |
+
Read the contents of a file.
|
| 88 |
+
|
| 89 |
+
Args:
|
| 90 |
+
file_path (str): Path to the file to read
|
| 91 |
+
|
| 92 |
+
Returns:
|
| 93 |
+
str: Contents of the file
|
| 94 |
+
"""
|
| 95 |
+
try:
|
| 96 |
+
# Make sure the file path is within the repository
|
| 97 |
+
full_path = os.path.join(self.repo_path, file_path)
|
| 98 |
+
if os.path.commonpath([os.path.realpath(full_path), os.path.realpath(self.repo_path)]) != os.path.realpath(self.repo_path):
|
| 99 |
+
return f"Error: Attempted to access file outside repository: {file_path}"
|
| 100 |
+
|
| 101 |
+
if not os.path.isfile(full_path):
|
| 102 |
+
return f"Error: File not found: {file_path}"
|
| 103 |
+
|
| 104 |
+
with open(full_path, 'r', encoding='utf-8', errors='replace') as f:
|
| 105 |
+
return f.read()
|
| 106 |
+
except Exception as e:
|
| 107 |
+
self.logger.error(f"Error reading file {file_path}: {e}")
|
| 108 |
+
return f"Error reading file {file_path}: {e}"
|
| 109 |
+
|
| 110 |
+
def find_files(self, pattern: str) -> List[str]:
|
| 111 |
+
"""
|
| 112 |
+
Find files matching a pattern in the repository.
|
| 113 |
+
|
| 114 |
+
Args:
|
| 115 |
+
pattern (str): Pattern to search for
|
| 116 |
+
|
| 117 |
+
Returns:
|
| 118 |
+
List[str]: List of files matching the pattern
|
| 119 |
+
"""
|
| 120 |
+
try:
|
| 121 |
+
# Use find command to search for files
|
| 122 |
+
result = subprocess.run(
|
| 123 |
+
["find", self.repo_path, "-type", "f", "-name", pattern],
|
| 124 |
+
capture_output=True,
|
| 125 |
+
text=True
|
| 126 |
+
)
|
| 127 |
+
|
| 128 |
+
# Convert absolute paths to relative paths
|
| 129 |
+
files = []
|
| 130 |
+
for file in result.stdout.strip().split('\n'):
|
| 131 |
+
if file:
|
| 132 |
+
rel_path = os.path.relpath(file, self.repo_path)
|
| 133 |
+
files.append(rel_path)
|
| 134 |
+
|
| 135 |
+
return files
|
| 136 |
+
except Exception as e:
|
| 137 |
+
self.logger.error(f"Error finding files with pattern {pattern}: {e}")
|
| 138 |
+
return []
|
| 139 |
+
|
| 140 |
+
def search_in_files(self, search_term: str) -> Dict[str, List[str]]:
|
| 141 |
+
"""
|
| 142 |
+
Search for a term in all files in the repository.
|
| 143 |
+
|
| 144 |
+
Args:
|
| 145 |
+
search_term (str): Term to search for
|
| 146 |
+
|
| 147 |
+
Returns:
|
| 148 |
+
Dict[str, List[str]]: Dictionary mapping file paths to lists of matching lines
|
| 149 |
+
"""
|
| 150 |
+
try:
|
| 151 |
+
# Use grep to search for the term
|
| 152 |
+
result = subprocess.run(
|
| 153 |
+
["grep", "-r", "--include=*.*", "--", search_term, self.repo_path],
|
| 154 |
+
capture_output=True,
|
| 155 |
+
text=True
|
| 156 |
+
)
|
| 157 |
+
|
| 158 |
+
# Parse the results
|
| 159 |
+
matches = {}
|
| 160 |
+
for line in result.stdout.strip().split('\n'):
|
| 161 |
+
if line:
|
| 162 |
+
parts = line.split(':', 1)
|
| 163 |
+
if len(parts) >= 2:
|
| 164 |
+
file_path = os.path.relpath(parts[0], self.repo_path)
|
| 165 |
+
content = parts[1]
|
| 166 |
+
|
| 167 |
+
if file_path not in matches:
|
| 168 |
+
matches[file_path] = []
|
| 169 |
+
|
| 170 |
+
matches[file_path].append(content.strip())
|
| 171 |
+
|
| 172 |
+
return matches
|
| 173 |
+
except Exception as e:
|
| 174 |
+
self.logger.error(f"Error searching for term {search_term}: {e}")
|
| 175 |
+
return {}
|
| 176 |
+
|
| 177 |
+
def analyze_vulnerability(self, vulnerability: Dict[str, Any]) -> Dict[str, Any]:
|
| 178 |
+
"""
|
| 179 |
+
Analyze a vulnerability using the agent.
|
| 180 |
+
|
| 181 |
+
This method implements the three-step process:
|
| 182 |
+
1. Analyze the CVE
|
| 183 |
+
2. Search for potential consequences in the codebase
|
| 184 |
+
3. Make a final report
|
| 185 |
+
|
| 186 |
+
Args:
|
| 187 |
+
vulnerability (Dict[str, Any]): Vulnerability information
|
| 188 |
+
|
| 189 |
+
Returns:
|
| 190 |
+
Dict[str, Any]: Analysis results
|
| 191 |
+
"""
|
| 192 |
+
self.logger.info(f"Analyzing vulnerability: {vulnerability.get('cve', 'Unknown CVE')}")
|
| 193 |
+
|
| 194 |
+
# Reset conversation history
|
| 195 |
+
self.conversation_history = []
|
| 196 |
+
|
| 197 |
+
# Step 1: Analyze the CVE
|
| 198 |
+
cve_analysis = self._analyze_cve(vulnerability)
|
| 199 |
+
|
| 200 |
+
# Step 2: Search potential consequences in the codebase
|
| 201 |
+
codebase_analysis = self._explore_codebase(vulnerability, cve_analysis)
|
| 202 |
+
|
| 203 |
+
# Step 3: Make the final report
|
| 204 |
+
final_report = self._generate_final_report(vulnerability, cve_analysis, codebase_analysis)
|
| 205 |
+
|
| 206 |
+
# Update the vulnerability with the analysis results
|
| 207 |
+
vulnerability.update(final_report)
|
| 208 |
+
|
| 209 |
+
return vulnerability
|
| 210 |
+
|
| 211 |
+
def _analyze_cve(self, vulnerability: Dict[str, Any]) -> Dict[str, Any]:
|
| 212 |
+
"""
|
| 213 |
+
Analyze a CVE to understand its potential impact.
|
| 214 |
+
|
| 215 |
+
Args:
|
| 216 |
+
vulnerability (Dict[str, Any]): Vulnerability information
|
| 217 |
+
|
| 218 |
+
Returns:
|
| 219 |
+
Dict[str, Any]: CVE analysis results
|
| 220 |
+
"""
|
| 221 |
+
self.logger.info("Step 1: Analyzing CVE")
|
| 222 |
+
|
| 223 |
+
# Get vulnerability text
|
| 224 |
+
vulnerability_text = vulnerability.get('vulnerability_text', '')
|
| 225 |
+
if not vulnerability_text:
|
| 226 |
+
# Create a text representation if not already present
|
| 227 |
+
package_name = vulnerability.get('package', vulnerability.get('package_name', ''))
|
| 228 |
+
vulnerability_text = f"""
|
| 229 |
+
Package: {package_name}
|
| 230 |
+
Version: {vulnerability.get('version', 'unknown')}
|
| 231 |
+
Severity: {vulnerability.get('severity', 'medium')}
|
| 232 |
+
Title: {vulnerability.get('message', vulnerability.get('title', 'Unknown vulnerability'))}
|
| 233 |
+
CVE: {vulnerability.get('cve', 'N/A')}
|
| 234 |
+
"""
|
| 235 |
+
|
| 236 |
+
# Create prompt for CVE analysis
|
| 237 |
+
prompt = f"""
|
| 238 |
+
You are a security expert analyzing a vulnerability in a software dependency.
|
| 239 |
+
|
| 240 |
+
Vulnerability information:
|
| 241 |
+
{vulnerability_text}
|
| 242 |
+
|
| 243 |
+
Please analyze this vulnerability and provide the following information:
|
| 244 |
+
1. What is this vulnerability about? Explain in simple terms.
|
| 245 |
+
2. What are the potential consequences if this vulnerability is exploited?
|
| 246 |
+
3. What types of code patterns or usage might be vulnerable?
|
| 247 |
+
4. What should I look for in the codebase to determine if the project is affected?
|
| 248 |
+
|
| 249 |
+
Format your response in a clear, concise manner.
|
| 250 |
+
"""
|
| 251 |
+
|
| 252 |
+
# Get LLM analysis
|
| 253 |
+
response = self.llm.generate(prompt)
|
| 254 |
+
|
| 255 |
+
# Add to conversation history
|
| 256 |
+
self.conversation_history.append({
|
| 257 |
+
"role": "user",
|
| 258 |
+
"content": prompt
|
| 259 |
+
})
|
| 260 |
+
self.conversation_history.append({
|
| 261 |
+
"role": "assistant",
|
| 262 |
+
"content": response
|
| 263 |
+
})
|
| 264 |
+
|
| 265 |
+
# Return the analysis
|
| 266 |
+
return {
|
| 267 |
+
"cve_analysis": response
|
| 268 |
+
}
|
| 269 |
+
|
| 270 |
+
def _explore_codebase(self, vulnerability: Dict[str, Any], cve_analysis: Dict[str, Any]) -> Dict[str, Any]:
|
| 271 |
+
"""
|
| 272 |
+
Explore the codebase to find potential consequences of the vulnerability.
|
| 273 |
+
|
| 274 |
+
Args:
|
| 275 |
+
vulnerability (Dict[str, Any]): Vulnerability information
|
| 276 |
+
cve_analysis (Dict[str, Any]): Results of CVE analysis
|
| 277 |
+
|
| 278 |
+
Returns:
|
| 279 |
+
Dict[str, Any]: Codebase exploration results
|
| 280 |
+
"""
|
| 281 |
+
self.logger.info("Step 2: Exploring codebase for potential consequences")
|
| 282 |
+
|
| 283 |
+
# Get project structure
|
| 284 |
+
project_structure = self.get_project_structure()
|
| 285 |
+
|
| 286 |
+
# Get package name
|
| 287 |
+
package_name = vulnerability.get('package', vulnerability.get('package_name', ''))
|
| 288 |
+
|
| 289 |
+
# Create prompt for codebase exploration
|
| 290 |
+
prompt = f"""
|
| 291 |
+
You are a security expert analyzing a codebase to determine if it's affected by a vulnerability.
|
| 292 |
+
|
| 293 |
+
Vulnerability information:
|
| 294 |
+
{vulnerability.get('vulnerability_text', '')}
|
| 295 |
+
|
| 296 |
+
Your previous analysis of this vulnerability:
|
| 297 |
+
{cve_analysis.get('cve_analysis', '')}
|
| 298 |
+
|
| 299 |
+
Project structure:
|
| 300 |
+
```
|
| 301 |
+
{project_structure}
|
| 302 |
+
```
|
| 303 |
+
|
| 304 |
+
Based on the project structure and the vulnerability information, I need you to help me explore this codebase to determine if it's affected by the vulnerability.
|
| 305 |
+
|
| 306 |
+
Please suggest:
|
| 307 |
+
1. Files that might be using the vulnerable package ({package_name})
|
| 308 |
+
2. Search terms I should use to find relevant code
|
| 309 |
+
3. Specific patterns I should look for
|
| 310 |
+
|
| 311 |
+
I'll help you explore the codebase based on your suggestions.
|
| 312 |
+
"""
|
| 313 |
+
|
| 314 |
+
# Get LLM suggestions
|
| 315 |
+
response = self.llm.generate_with_context(prompt, self.conversation_history)
|
| 316 |
+
|
| 317 |
+
# Add to conversation history
|
| 318 |
+
self.conversation_history.append({
|
| 319 |
+
"role": "user",
|
| 320 |
+
"content": prompt
|
| 321 |
+
})
|
| 322 |
+
self.conversation_history.append({
|
| 323 |
+
"role": "assistant",
|
| 324 |
+
"content": response
|
| 325 |
+
})
|
| 326 |
+
|
| 327 |
+
# Now let's actually explore the codebase based on the suggestions
|
| 328 |
+
exploration_results = self._perform_exploration(response, package_name)
|
| 329 |
+
|
| 330 |
+
# Create a prompt with the exploration results
|
| 331 |
+
prompt = f"""
|
| 332 |
+
Based on your suggestions, I've explored the codebase. Here are the results:
|
| 333 |
+
|
| 334 |
+
{exploration_results}
|
| 335 |
+
|
| 336 |
+
Based on these findings, please analyze:
|
| 337 |
+
1. Is the project likely affected by the vulnerability?
|
| 338 |
+
2. What specific code patterns are concerning?
|
| 339 |
+
3. What would you recommend to fix the issue?
|
| 340 |
+
"""
|
| 341 |
+
|
| 342 |
+
# Get LLM analysis of exploration results
|
| 343 |
+
response = self.llm.generate_with_context(prompt, self.conversation_history)
|
| 344 |
+
|
| 345 |
+
# Add to conversation history
|
| 346 |
+
self.conversation_history.append({
|
| 347 |
+
"role": "user",
|
| 348 |
+
"content": prompt
|
| 349 |
+
})
|
| 350 |
+
self.conversation_history.append({
|
| 351 |
+
"role": "assistant",
|
| 352 |
+
"content": response
|
| 353 |
+
})
|
| 354 |
+
|
| 355 |
+
# Return the exploration results
|
| 356 |
+
return {
|
| 357 |
+
"exploration_results": exploration_results,
|
| 358 |
+
"exploration_analysis": response
|
| 359 |
+
}
|
| 360 |
+
|
| 361 |
+
def _perform_exploration(self, suggestions: str, package_name: str) -> str:
|
| 362 |
+
"""
|
| 363 |
+
Perform exploration of the codebase based on LLM suggestions.
|
| 364 |
+
|
| 365 |
+
Args:
|
| 366 |
+
suggestions (str): LLM suggestions for exploration
|
| 367 |
+
package_name (str): Name of the vulnerable package
|
| 368 |
+
|
| 369 |
+
Returns:
|
| 370 |
+
str: Results of the exploration
|
| 371 |
+
"""
|
| 372 |
+
results = []
|
| 373 |
+
|
| 374 |
+
# Search for the package name in all files
|
| 375 |
+
results.append(f"Searching for package '{package_name}' in all files:")
|
| 376 |
+
matches = self.search_in_files(package_name)
|
| 377 |
+
if matches:
|
| 378 |
+
for file_path, lines in matches.items():
|
| 379 |
+
results.append(f"\nFile: {file_path}")
|
| 380 |
+
for line in lines[:5]: # Limit to 5 lines per file
|
| 381 |
+
results.append(f" {line}")
|
| 382 |
+
if len(lines) > 5:
|
| 383 |
+
results.append(f" ... ({len(lines) - 5} more matches)")
|
| 384 |
+
else:
|
| 385 |
+
results.append(" No direct matches found.")
|
| 386 |
+
|
| 387 |
+
# Look for package.json or requirements.txt to check if the package is declared as a dependency
|
| 388 |
+
dependency_files = self.find_files("package.json") + self.find_files("requirements.txt")
|
| 389 |
+
if dependency_files:
|
| 390 |
+
results.append("\nChecking dependency files:")
|
| 391 |
+
for file_path in dependency_files:
|
| 392 |
+
results.append(f"\nFile: {file_path}")
|
| 393 |
+
content = self.read_file(file_path)
|
| 394 |
+
results.append(f"```\n{content[:1000]}{'...' if len(content) > 1000 else ''}\n```")
|
| 395 |
+
|
| 396 |
+
# Extract additional search terms from suggestions
|
| 397 |
+
import re
|
| 398 |
+
search_terms = re.findall(r'search for ["\']([^"\']+)["\']', suggestions, re.IGNORECASE)
|
| 399 |
+
search_terms += re.findall(r'search term[s]?:?\s*["\']([^"\']+)["\']', suggestions, re.IGNORECASE)
|
| 400 |
+
search_terms += re.findall(r'search for:?\s*["\']([^"\']+)["\']', suggestions, re.IGNORECASE)
|
| 401 |
+
search_terms += re.findall(r'look for ["\']([^"\']+)["\']', suggestions, re.IGNORECASE)
|
| 402 |
+
|
| 403 |
+
# Remove duplicates and the package name (already searched)
|
| 404 |
+
search_terms = list(set(search_terms))
|
| 405 |
+
if package_name in search_terms:
|
| 406 |
+
search_terms.remove(package_name)
|
| 407 |
+
|
| 408 |
+
# Search for additional terms
|
| 409 |
+
if search_terms:
|
| 410 |
+
results.append("\nSearching for additional terms suggested by the analysis:")
|
| 411 |
+
for term in search_terms[:3]: # Limit to 3 terms to avoid too much output
|
| 412 |
+
results.append(f"\nTerm: '{term}'")
|
| 413 |
+
matches = self.search_in_files(term)
|
| 414 |
+
if matches:
|
| 415 |
+
for file_path, lines in matches.items():
|
| 416 |
+
results.append(f"File: {file_path}")
|
| 417 |
+
for line in lines[:3]: # Limit to 3 lines per file
|
| 418 |
+
results.append(f" {line}")
|
| 419 |
+
if len(lines) > 3:
|
| 420 |
+
results.append(f" ... ({len(lines) - 3} more matches)")
|
| 421 |
+
else:
|
| 422 |
+
results.append(" No matches found.")
|
| 423 |
+
|
| 424 |
+
# Extract file patterns from suggestions
|
| 425 |
+
file_patterns = re.findall(r'files? (?:named|called|like) ["\']([^"\']+)["\']', suggestions, re.IGNORECASE)
|
| 426 |
+
file_patterns += re.findall(r'check (?:the )?file[s]? ["\']([^"\']+)["\']', suggestions, re.IGNORECASE)
|
| 427 |
+
|
| 428 |
+
# Search for specific files
|
| 429 |
+
if file_patterns:
|
| 430 |
+
results.append("\nSearching for specific files suggested by the analysis:")
|
| 431 |
+
for pattern in file_patterns[:3]: # Limit to 3 patterns
|
| 432 |
+
results.append(f"\nPattern: '{pattern}'")
|
| 433 |
+
files = self.find_files(f"*{pattern}*")
|
| 434 |
+
if files:
|
| 435 |
+
for file_path in files[:3]: # Limit to 3 files per pattern
|
| 436 |
+
results.append(f"File: {file_path}")
|
| 437 |
+
content = self.read_file(file_path)
|
| 438 |
+
results.append(f"```\n{content[:500]}{'...' if len(content) > 500 else ''}\n```")
|
| 439 |
+
if len(files) > 3:
|
| 440 |
+
results.append(f"... ({len(files) - 3} more files)")
|
| 441 |
+
else:
|
| 442 |
+
results.append(" No matching files found.")
|
| 443 |
+
|
| 444 |
+
return "\n".join(results)
|
| 445 |
+
|
| 446 |
+
def _generate_final_report(self, vulnerability: Dict[str, Any], cve_analysis: Dict[str, Any], codebase_analysis: Dict[str, Any]) -> Dict[str, Any]:
|
| 447 |
+
"""
|
| 448 |
+
Generate a final report based on the CVE analysis and codebase exploration.
|
| 449 |
+
|
| 450 |
+
Args:
|
| 451 |
+
vulnerability (Dict[str, Any]): Vulnerability information
|
| 452 |
+
cve_analysis (Dict[str, Any]): Results of CVE analysis
|
| 453 |
+
codebase_analysis (Dict[str, Any]): Results of codebase exploration
|
| 454 |
+
|
| 455 |
+
Returns:
|
| 456 |
+
Dict[str, Any]: Final report
|
| 457 |
+
"""
|
| 458 |
+
self.logger.info("Step 3: Generating final report")
|
| 459 |
+
|
| 460 |
+
# Create prompt for final report
|
| 461 |
+
prompt = f"""
|
| 462 |
+
Based on our analysis of the vulnerability and exploration of the codebase, please provide a final assessment with the following information:
|
| 463 |
+
|
| 464 |
+
1. PROJECT_SEVERITY: Assess the severity of this vulnerability for the project (critical, high, medium, low, or info).
|
| 465 |
+
2. IS_PROJECT_IMPACTED: Determine if the project is likely impacted by this vulnerability (true/false).
|
| 466 |
+
3. IMPACTED_CODE: Identify any code patterns that might be vulnerable.
|
| 467 |
+
4. PROPOSED_FIX: Suggest a specific fix for this vulnerability.
|
| 468 |
+
5. EXPLANATION: Provide a clear explanation of the vulnerability and its implications for this specific project.
|
| 469 |
+
|
| 470 |
+
Format your response as follows:
|
| 471 |
+
PROJECT_SEVERITY: [Your assessment]
|
| 472 |
+
IS_PROJECT_IMPACTED: [true/false]
|
| 473 |
+
IMPACTED_CODE: [Your assessment]
|
| 474 |
+
PROPOSED_FIX: [Your suggestion]
|
| 475 |
+
EXPLANATION: [Your explanation]
|
| 476 |
+
"""
|
| 477 |
+
|
| 478 |
+
# Get LLM final report
|
| 479 |
+
response = self.llm.generate_with_context(prompt, self.conversation_history)
|
| 480 |
+
|
| 481 |
+
# Add to conversation history
|
| 482 |
+
self.conversation_history.append({
|
| 483 |
+
"role": "user",
|
| 484 |
+
"content": prompt
|
| 485 |
+
})
|
| 486 |
+
self.conversation_history.append({
|
| 487 |
+
"role": "assistant",
|
| 488 |
+
"content": response
|
| 489 |
+
})
|
| 490 |
+
|
| 491 |
+
# Parse the response to extract the required fields
|
| 492 |
+
import re
|
| 493 |
+
|
| 494 |
+
project_severity = self._extract_field(response, "PROJECT_SEVERITY")
|
| 495 |
+
is_project_impacted = self._extract_field(response, "IS_PROJECT_IMPACTED")
|
| 496 |
+
impacted_code = self._extract_field(response, "IMPACTED_CODE")
|
| 497 |
+
proposed_fix = self._extract_field(response, "PROPOSED_FIX")
|
| 498 |
+
explanation = self._extract_field(response, "EXPLANATION")
|
| 499 |
+
|
| 500 |
+
# Convert is_project_impacted to boolean
|
| 501 |
+
is_project_impacted_bool = False
|
| 502 |
+
if is_project_impacted.strip().lower().startswith("true"):
|
| 503 |
+
is_project_impacted_bool = True
|
| 504 |
+
|
| 505 |
+
# Return the final report
|
| 506 |
+
return {
|
| 507 |
+
"project_severity": project_severity,
|
| 508 |
+
"is_project_impacted": is_project_impacted_bool,
|
| 509 |
+
"impacted_code": impacted_code,
|
| 510 |
+
"proposed_fix": proposed_fix,
|
| 511 |
+
"explanation": explanation,
|
| 512 |
+
"llm_analysis": {
|
| 513 |
+
"is_vulnerable": is_project_impacted_bool,
|
| 514 |
+
"confidence": "medium",
|
| 515 |
+
"impact": project_severity,
|
| 516 |
+
"explanation": explanation,
|
| 517 |
+
"remediation": proposed_fix,
|
| 518 |
+
"provider": self.llm.provider_name,
|
| 519 |
+
"model": self.llm.model_name
|
| 520 |
+
}
|
| 521 |
+
}
|
| 522 |
+
|
| 523 |
+
def _extract_field(self, text: str, field_name: str) -> str:
|
| 524 |
+
"""
|
| 525 |
+
Extract a field from the LLM response.
|
| 526 |
+
|
| 527 |
+
Args:
|
| 528 |
+
text (str): The LLM response text
|
| 529 |
+
field_name (str): The name of the field to extract
|
| 530 |
+
|
| 531 |
+
Returns:
|
| 532 |
+
str: The extracted field value, or a default message if not found
|
| 533 |
+
"""
|
| 534 |
+
import re
|
| 535 |
+
pattern = rf"{field_name}:\s*(.*?)(?:\n[A-Z_]+:|$)"
|
| 536 |
+
match = re.search(pattern, text, re.DOTALL)
|
| 537 |
+
if match:
|
| 538 |
+
return match.group(1).strip()
|
| 539 |
+
return f"No {field_name.lower()} provided."
|
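The `_extract_field` helper splits the labelled LLM response with a single regular expression. The following is a minimal standalone sketch of that parsing, assuming the response follows the `FIELD: value` layout requested in the prompt. `extract_field` mirrors the method in the diff; `parse_impacted` and the sample response are hypothetical additions for illustration and are not part of the commit.

```python
import re


def extract_field(text: str, field_name: str) -> str:
    """Standalone copy of the regex used by SecurityAgent._extract_field."""
    # Capture lazily until the next "UPPER_CASE:" label or the end of the text.
    pattern = rf"{field_name}:\s*(.*?)(?:\n[A-Z_]+:|$)"
    match = re.search(pattern, text, re.DOTALL)
    if match:
        return match.group(1).strip()
    return f"No {field_name.lower()} provided."


def parse_impacted(value: str) -> bool:
    """Hypothetical helper: tolerate answers such as "True." or "[true]"."""
    return value.strip().strip("[].").lower() == "true"


# Made-up response text, shaped like the format the prompt asks for.
sample = (
    "PROJECT_SEVERITY: high\n"
    "IS_PROJECT_IMPACTED: true\n"
    "IMPACTED_CODE: app.js passes user input\n"
    "to a merge helper\n"
    "PROPOSED_FIX: upgrade the dependency\n"
    "EXPLANATION: Prototype pollution."
)

severity = extract_field(sample, "PROJECT_SEVERITY")
impacted = parse_impacted(extract_field(sample, "IS_PROJECT_IMPACTED"))
impacted_code = extract_field(sample, "IMPACTED_CODE")
missing = extract_field(sample, "CONFIDENCE")
```

Multi-line values survive as long as a continuation line does not itself start with an upper-case label followed by a colon. A field that is absent yields the default message, which the strict `== "true"` comparison in the diff then treats as not impacted.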
agent_piment_bleu/logger.py
CHANGED
|
@@ -7,6 +7,8 @@ for managing a logging box in the UI.
 
 from typing import List, Optional
 import datetime
+import inspect
+import os
 
 
 class Logger:
@@ -39,10 +41,25 @@ class Logger:
         """
         self._ui_callback = callback
 
-    def _format_log(self, message: str, level: str) -> str:
-        """
+    def _format_log(self, message: str, level: str, caller_info=None) -> str:
+        """
+        Format a log message with timestamp, level, and caller information.
+
+        Args:
+            message (str): The log message
+            level (str): The log level (INFO, WARNING, ERROR, DEBUG)
+            caller_info (tuple, optional): Tuple containing (function_name, filename, line_number)
+
+        Returns:
+            str: Formatted log message
+        """
         timestamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
-
+
+        if caller_info:
+            function_name, filename, line_number = caller_info
+            return f"[{timestamp}] [{level}] [{function_name}] {message}"
+        else:
+            return f"[{timestamp}] [{level}] {message}"
 
     def _update_ui(self):
         """Update the UI logging box if a callback is set."""
@@ -50,30 +67,58 @@ class Logger:
             log_content = "\n".join(self._logs)
             self._ui_callback(log_content)
 
+    def _get_caller_info(self, stack_level=2):
+        """
+        Get information about the calling function.
+
+        Args:
+            stack_level (int): How many levels up the stack to look (2 is the caller of the logging method)
+
+        Returns:
+            tuple: (function_name, filename, line_number)
+        """
+        frame = inspect.currentframe()
+        # Go up the stack to the caller of the logging method
+        for _ in range(stack_level):
+            if frame.f_back is not None:
+                frame = frame.f_back
+            else:
+                break
+
+        function_name = frame.f_code.co_name
+        filename = os.path.basename(frame.f_code.co_filename)
+        line_number = frame.f_lineno
+
+        return (function_name, filename, line_number)
+
     def info(self, message: str):
         """Log an informational message."""
-
+        caller_info = self._get_caller_info()
+        log_entry = self._format_log(message, "INFO", caller_info)
         self._logs.insert(0, log_entry)
         self._update_ui()
         return log_entry
 
     def warning(self, message: str):
         """Log a warning message."""
-
+        caller_info = self._get_caller_info()
+        log_entry = self._format_log(message, "WARNING", caller_info)
         self._logs.insert(0, log_entry)
         self._update_ui()
         return log_entry
 
     def error(self, message: str):
         """Log an error message."""
-
+        caller_info = self._get_caller_info()
+        log_entry = self._format_log(message, "ERROR", caller_info)
         self._logs.insert(0, log_entry)
         self._update_ui()
         return log_entry
 
     def debug(self, message: str):
         """Log a debug message."""
-
+        caller_info = self._get_caller_info()
+        log_entry = self._format_log(message, "DEBUG", caller_info)
         self._logs.insert(0, log_entry)
         self._update_ui()
         return log_entry
|
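The logger change tags each entry with the name of the function that called the logging method, found by walking up the call stack. A minimal sketch of that lookup outside the `Logger` class follows; the free functions `get_caller_info`, `log_info` and `scan_repository` are hypothetical stand-ins, and the `None` guard is an added assumption for interpreters where `inspect.currentframe()` is unavailable.

```python
import inspect
import os


def get_caller_info(stack_level: int = 2):
    """Return (function_name, filename, line_number) for a frame up the stack."""
    frame = inspect.currentframe()
    if frame is None:
        # Assumption: some Python implementations do not expose stack frames.
        return ("<unknown>", "<unknown>", 0)
    # Level 1 is the function that called get_caller_info,
    # level 2 is the caller of that function, and so on.
    for _ in range(stack_level):
        if frame.f_back is None:
            break
        frame = frame.f_back
    code = frame.f_code
    return (code.co_name, os.path.basename(code.co_filename), frame.f_lineno)


def log_info(message: str) -> str:
    """Plays the role of Logger.info: reports who called it."""
    function_name, _filename, _line = get_caller_info(stack_level=2)
    return f"[INFO] [{function_name}] {message}"


def scan_repository() -> str:
    """Hypothetical caller whose name should appear in the log entry."""
    return log_info("starting scan")


entry = scan_repository()
```

With `stack_level=2` the entry names `scan_repository` rather than `log_info`, which is why the diff's default of 2 points at the caller of the logging method.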
agent_piment_bleu/orchestrator.py
CHANGED
|
@@ -2,6 +2,8 @@ import os
|
|
| 2 |
import tempfile
|
| 3 |
import shutil
|
| 4 |
import importlib
|
|
|
|
|
|
|
| 5 |
from typing import Dict, Any, List, Optional
|
| 6 |
|
| 7 |
from agent_piment_bleu.utils.git_utils import clone_repository
|
|
@@ -9,6 +11,10 @@ from agent_piment_bleu.project_detector import detect_project_languages
|
|
| 9 |
from agent_piment_bleu.reporting import generate_markdown_report
|
| 10 |
from agent_piment_bleu.llm import create_llm_provider, get_llm_config
|
| 11 |
from agent_piment_bleu.logger import get_logger
|
|
|
|
|
|
|
|
|
|
|
|
|
| 12 |
|
| 13 |
def analyze_repository(repo_url, use_llm=True, llm_provider=None):
|
| 14 |
"""
|
|
@@ -31,13 +37,80 @@ def analyze_repository(repo_url, use_llm=True, llm_provider=None):
|
|
| 31 |
logger.info(f"Created temporary directory: {temp_dir}")
|
| 32 |
|
| 33 |
try:
|
| 34 |
-
#
|
| 35 |
-
|
| 36 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
| 37 |
|
| 38 |
-
|
| 39 |
-
|
| 40 |
-
|
| 41 |
|
| 42 |
# Detect languages used in the repository
|
| 43 |
logger.info("Detecting project languages")
|
|
@@ -72,7 +145,7 @@ def analyze_repository(repo_url, use_llm=True, llm_provider=None):
|
|
| 72 |
# Enhance SAST results with LLM if available
|
| 73 |
if llm and sast_result.get('success', False) and sast_result.get('findings', []):
|
| 74 |
logger.info(f"Enhancing SAST results with LLM for language: {language}")
|
| 75 |
-
sast_result = enhance_sast_with_llm(sast_result, llm, language)
|
| 76 |
scan_results.append(sast_result)
|
| 77 |
logger.info(f"SAST scan for {language} completed with {len(sast_result.get('findings', []))} findings")
|
| 78 |
|
|
@@ -208,7 +281,7 @@ def run_sca_scan(language, repo_path):
|
|
| 208 |
}
|
| 209 |
|
| 210 |
|
| 211 |
-
def enhance_sast_with_llm(sast_result: Dict[str, Any], llm, language: str) -> Dict[str, Any]:
|
| 212 |
"""
|
| 213 |
Enhance SAST results with LLM analysis.
|
| 214 |
|
|
@@ -216,11 +289,22 @@ def enhance_sast_with_llm(sast_result: Dict[str, Any], llm, language: str) -> Di
|
|
| 216 |
sast_result (Dict[str, Any]): Original SAST results
|
| 217 |
llm: LLM provider instance
|
| 218 |
language (str): Programming language
|
|
|
|
| 219 |
|
| 220 |
Returns:
|
| 221 |
Dict[str, Any]: Enhanced SAST results
|
| 222 |
"""
|
| 223 |
enhanced_findings = []
|
| 224 |
|
| 225 |
for finding in sast_result.get('findings', []):
|
| 226 |
# Skip if no code snippet is available
|
|
@@ -229,36 +313,82 @@ def enhance_sast_with_llm(sast_result: Dict[str, Any], llm, language: str) -> Di
|
|
| 229 |
continue
|
| 230 |
|
| 231 |
try:
|
| 232 |
-
#
|
| 233 |
-
|
| 234 |
-
|
| 235 |
-
|
| 236 |
-
|
| 237 |
-
|
| 238 |
-
|
| 239 |
-
|
| 240 |
-
|
| 241 |
-
|
| 242 |
-
|
| 243 |
-
|
| 244 |
-
|
| 245 |
-
|
| 246 |
-
|
|
|
|
|
|
|
| 247 |
|
| 248 |
-
|
| 249 |
except Exception as e:
|
| 250 |
-
|
| 251 |
# Keep the original finding if enhancement fails
|
| 252 |
enhanced_findings.append(finding)
|
| 253 |
|
| 254 |
# Update the findings in the result
|
| 255 |
sast_result['findings'] = enhanced_findings
|
| 256 |
sast_result['llm_enhanced'] = True
|
|
|
|
|
|
|
| 257 |
|
| 258 |
return sast_result
|
| 259 |
|
| 260 |
|
| 261 |
-
def enhance_sca_with_llm(sca_result: Dict[str, Any], llm, language: str, repo_path: str) -> Dict[str, Any]:
|
| 262 |
"""
|
| 263 |
Enhance SCA results with LLM analysis.
|
| 264 |
|
|
@@ -266,7 +396,7 @@ def enhance_sca_with_llm(sca_result: Dict[str, Any], llm, language: str, repo_pa
|
|
| 266 |
sca_result (Dict[str, Any]): Original SCA results
|
| 267 |
llm: LLM provider instance
|
| 268 |
language (str): Programming language
|
| 269 |
-
repo_path (str): Path to the repository
|
| 270 |
|
| 271 |
Returns:
|
| 272 |
Dict[str, Any]: Enhanced SCA results
|
|
@@ -274,20 +404,22 @@ def enhance_sca_with_llm(sca_result: Dict[str, Any], llm, language: str, repo_pa
|
|
| 274 |
enhanced_findings = []
|
| 275 |
logger = get_logger()
|
| 276 |
|
| 277 |
-
|
|
|
|
|
|
|
| 278 |
try:
|
| 279 |
-
|
| 280 |
-
|
| 281 |
-
|
| 282 |
-
|
| 283 |
-
dependency=package_name,
|
| 284 |
-
language=language
|
| 285 |
-
)
|
| 286 |
|
|
|
|
|
|
|
| 287 |
# Get vulnerability text for AI agent analysis
|
| 288 |
vulnerability_text = finding.get('vulnerability_text', '')
|
| 289 |
if not vulnerability_text:
|
| 290 |
# Create a text representation if not already present
|
|
|
|
| 291 |
vulnerability_text = f"""
|
| 292 |
Package: {package_name}
|
| 293 |
Version: {finding.get('version', 'unknown')}
|
|
@@ -295,70 +427,25 @@ Severity: {finding.get('severity', 'medium')}
|
|
| 295 |
Title: {finding.get('message', finding.get('title', 'Unknown vulnerability'))}
|
| 296 |
CVE: {finding.get('cve', 'N/A')}
|
| 297 |
"""
|
| 298 |
|
| 299 |
-
# Prepare prompt for LLM analysis
|
| 300 |
-
prompt = f"""
|
| 301 |
-
Analyze the following security vulnerability in a {language} dependency:
|
| 302 |
-
|
| 303 |
-
{vulnerability_text}
|
| 304 |
-
|
| 305 |
-
Code snippets that might be using this dependency:
|
| 306 |
-
{code_snippets if code_snippets else "No specific code snippets found."}
|
| 307 |
-
|
| 308 |
-
Please provide the following information:
|
| 309 |
-
1. Project severity note: Assess the severity of this vulnerability for the project (critical, high, medium, low, or info).
|
| 310 |
-
2. Is project impacted: Determine if the project is likely impacted by this vulnerability (true/false).
|
| 311 |
-
3. Potentially impacted code: Identify any code patterns that might be vulnerable.
|
| 312 |
-
4. Proposed fix: Suggest a specific fix for this vulnerability.
|
| 313 |
-
5. Human-readable explanation: Provide a clear explanation of the vulnerability and its implications.
|
| 314 |
-
|
| 315 |
-
Format your response as follows:
|
| 316 |
-
PROJECT_SEVERITY: [Your assessment]
|
| 317 |
-
IS_PROJECT_IMPACTED: [true/false]
|
| 318 |
-
IMPACTED_CODE: [Your assessment]
|
| 319 |
-
PROPOSED_FIX: [Your suggestion]
|
| 320 |
-
EXPLANATION: [Your explanation]
|
| 321 |
-
"""
|
| 322 |
-
|
| 323 |
-
logger.info(f"Sending SCA vulnerability for LLM analysis: {package_name}")
|
| 324 |
-
|
| 325 |
-
# Get LLM analysis
|
| 326 |
-
try:
|
| 327 |
-
analysis_response = llm.generate_text(prompt)
|
| 328 |
-
|
| 329 |
-
# Parse the response to extract the required fields
|
| 330 |
-
project_severity = extract_field(analysis_response, "PROJECT_SEVERITY")
|
| 331 |
-
is_project_impacted = extract_field(analysis_response, "IS_PROJECT_IMPACTED")
|
| 332 |
-
impacted_code = extract_field(analysis_response, "IMPACTED_CODE")
|
| 333 |
-
proposed_fix = extract_field(analysis_response, "PROPOSED_FIX")
|
| 334 |
-
explanation = extract_field(analysis_response, "EXPLANATION")
|
| 335 |
-
|
| 336 |
-
# Convert is_project_impacted to boolean
|
| 337 |
-
is_project_impacted_bool = False
|
| 338 |
-
if is_project_impacted.lower() == "true":
|
| 339 |
-
is_project_impacted_bool = True
|
| 340 |
-
|
| 341 |
-
# Add the analysis to the finding
|
| 342 |
-
finding['project_severity'] = project_severity
|
| 343 |
-
finding['is_project_impacted'] = is_project_impacted_bool
|
| 344 |
-
finding['impacted_code'] = impacted_code
|
| 345 |
-
finding['proposed_fix'] = proposed_fix
|
| 346 |
-
finding['explanation'] = explanation
|
| 347 |
-
|
| 348 |
-
# Keep the original LLM analysis fields for backward compatibility
|
| 349 |
-
finding['llm_analysis'] = {
|
| 350 |
-
'is_vulnerable': is_project_impacted_bool,
|
| 351 |
-
'confidence': 'medium',
|
| 352 |
-
'impact': project_severity,
|
| 353 |
-
'explanation': explanation,
|
| 354 |
-
'remediation': proposed_fix,
|
| 355 |
-
'provider': llm.provider_name,
|
| 356 |
-
'model': llm.model_name
|
| 357 |
-
}
|
| 358 |
-
|
| 359 |
-
logger.info(f"Successfully analyzed vulnerability for {package_name}")
|
| 360 |
-
except Exception as e:
|
| 361 |
-
logger.error(f"Error during LLM analysis: {e}")
|
| 362 |
# Set default values if analysis fails
|
| 363 |
finding['project_severity'] = finding.get('severity', 'unknown')
|
| 364 |
finding['is_project_impacted'] = True
|
|
@@ -371,58 +458,25 @@ EXPLANATION: [Your explanation]
|
|
| 371 |
'is_vulnerable': True,
|
| 372 |
'confidence': 'low',
|
| 373 |
'impact': finding.get('severity', 'unknown'),
|
| 374 |
-
'explanation': "Could not analyze with
|
| 375 |
'remediation': f"Update {package_name} to the latest version.",
|
| 376 |
'provider': llm.provider_name if llm else 'unknown',
|
| 377 |
'model': llm.model_name if llm else 'unknown'
|
| 378 |
}
|
| 379 |
|
| 380 |
-
|
| 381 |
except Exception as e:
|
| 382 |
-
logger.error(f"Error enhancing SCA finding with
|
| 383 |
# Keep the original finding if enhancement fails
|
| 384 |
enhanced_findings.append(finding)
|
| 385 |
|
| 386 |
# Update the findings in the result
|
| 387 |
sca_result['findings'] = enhanced_findings
|
| 388 |
sca_result['llm_enhanced'] = True
|
|
|
|
|
|
|
| 389 |
|
| 390 |
return sca_result
|
| 391 |
|
| 392 |
|
| 393 |
-
|
| 394 |
-
"""
|
| 395 |
-
Extract a field from the LLM response.
|
| 396 |
-
|
| 397 |
-
Args:
|
| 398 |
-
text (str): The LLM response text
|
| 399 |
-
field_name (str): The name of the field to extract
|
| 400 |
-
|
| 401 |
-
Returns:
|
| 402 |
-
str: The extracted field value, or a default message if not found
|
| 403 |
-
"""
|
| 404 |
-
import re
|
| 405 |
-
pattern = rf"{field_name}:\s*(.*?)(?:\n[A-Z_]+:|$)"
|
| 406 |
-
match = re.search(pattern, text, re.DOTALL)
|
| 407 |
-
if match:
|
| 408 |
-
return match.group(1).strip()
|
| 409 |
-
return f"No {field_name.lower()} provided."
|
| 410 |
-
|
| 411 |
-
|
| 412 |
-
def find_dependency_usage(repo_path: str, dependency: str, language: str) -> List[str]:
|
| 413 |
-
"""
|
| 414 |
-
Find code snippets that use the specified dependency.
|
| 415 |
-
|
| 416 |
-
Args:
|
| 417 |
-
repo_path (str): Path to the repository
|
| 418 |
-
dependency (str): Name of the dependency
|
| 419 |
-
language (str): Programming language
|
| 420 |
-
|
| 421 |
-
Returns:
|
| 422 |
-
List[str]: List of code snippets that use the dependency
|
| 423 |
-
"""
|
| 424 |
-
# This is a simplified implementation that would need to be expanded
|
| 425 |
-
# for a production system to properly find all usages of a dependency
|
| 426 |
-
|
| 427 |
-
# For now, return an empty list as a placeholder
|
| 428 |
-
return []
|
|
|
|
| 2 |
import tempfile
|
| 3 |
import shutil
|
| 4 |
import importlib
|
| 5 |
+
import importlib.resources
|
| 6 |
+
import pkg_resources
|
| 7 |
from typing import Dict, Any, List, Optional
|
| 8 |
|
| 9 |
from agent_piment_bleu.utils.git_utils import clone_repository
|
|
|
|
| 11 |
from agent_piment_bleu.reporting import generate_markdown_report
|
| 12 |
from agent_piment_bleu.llm import create_llm_provider, get_llm_config
|
| 13 |
from agent_piment_bleu.logger import get_logger
|
| 14 |
+
from agent_piment_bleu.agent import SecurityAgent
|
| 15 |
+
|
| 16 |
+
# Special URL for testing with the dummy vulnerable JS project
|
| 17 |
+
TEST_JS_VULN_URL = "test://js-vulnerable-project"
|
| 18 |
|
| 19 |
def analyze_repository(repo_url, use_llm=True, llm_provider=None):
|
| 20 |
"""
|
|
|
|
     logger.info(f"Created temporary directory: {temp_dir}")
 
     try:
+        # Check if this is a test URL for the dummy vulnerable JS project
+        if repo_url == TEST_JS_VULN_URL:
+            # Use the dummy project instead of cloning
+            logger.info(f"Using dummy vulnerable JS project for testing")
+
+            # Try multiple methods to find the examples directory
+            dummy_project_path = None
 
+            # Method 1: Try to find it relative to the package
+            try:
+                dummy_project_path = pkg_resources.resource_filename('agent_piment_bleu', '../examples/js_vuln')
+                if os.path.exists(dummy_project_path):
+                    logger.info(f"Found dummy project using pkg_resources: {dummy_project_path}")
+                else:
+                    dummy_project_path = None
+            except (ImportError, ModuleNotFoundError):
+                logger.debug("Could not find examples using pkg_resources")
+
+            # Method 2: Try to find it relative to the current file
+            if not dummy_project_path:
+                try:
+                    dummy_project_path = os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))),
+                                                      "examples", "js_vuln")
+                    if os.path.exists(dummy_project_path):
+                        logger.info(f"Found dummy project relative to package: {dummy_project_path}")
+                    else:
+                        dummy_project_path = None
+                except Exception as e:
+                    logger.debug(f"Could not find examples relative to package: {e}")
+
+            # Method 3: Try to find it in the installation directory
+            if not dummy_project_path:
+                try:
+                    import agent_piment_bleu
+                    package_dir = os.path.dirname(os.path.dirname(agent_piment_bleu.__file__))
+                    dummy_project_path = os.path.join(package_dir, "examples", "js_vuln")
+                    if os.path.exists(dummy_project_path):
+                        logger.info(f"Found dummy project in installation directory: {dummy_project_path}")
+                    else:
+                        dummy_project_path = None
+                except Exception as e:
+                    logger.debug(f"Could not find examples in installation directory: {e}")
+
+            if not dummy_project_path or not os.path.exists(dummy_project_path):
+                error_msg = "Dummy project not found. Please ensure the examples/js_vuln directory is included in the package."
+                logger.error(error_msg)
+                return f"## Error\n\n{error_msg}"
+
+            # Copy the dummy project to the temp directory
+            try:
+                for item in os.listdir(dummy_project_path):
+                    src = os.path.join(dummy_project_path, item)
+                    dst = os.path.join(temp_dir, item)
+                    if os.path.isdir(src):
+                        shutil.copytree(src, dst)
+                    else:
+                        shutil.copy2(src, dst)
+            except TypeError as e:
+                if "expected str, bytes or os.PathLike object, not NoneType" in str(e):
+                    error_msg = "Failed to access dummy project path. Path is None."
+                    logger.error(error_msg)
+                    return f"## Error\n\n{error_msg}"
+                raise
+
+            logger.info(f"Copied dummy project to {temp_dir}")
+            clone_result = {"success": True, "message": "Dummy project copied successfully"}
+        else:
+            # Clone the repository
+            logger.info(f"Cloning repository: {repo_url}")
+            clone_result = clone_repository(repo_url, temp_dir)
+
+        if not clone_result["success"]:
+            logger.error(f"Failed to clone repository: {clone_result['message']}")
+            return f"## Error\n\n{clone_result['message']}"
 
         # Detect languages used in the repository
         logger.info("Detecting project languages")
|
|
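The hunk above looks for the `examples/js_vuln` directory in three places and keeps the first one that exists. The same search can be written as a loop over candidate paths. This is only a sketch under the assumption that every candidate can be computed up front; `first_existing_dir` is a hypothetical helper and does not appear in the commit.

```python
import os
import tempfile
from typing import Iterable, Optional


def first_existing_dir(candidates: Iterable[Optional[str]]) -> Optional[str]:
    """Return the first candidate that is an existing directory, else None."""
    for path in candidates:
        # Skip lookups that produced nothing (None or an empty string).
        if path and os.path.isdir(path):
            return path
    return None


# Usage with throwaway paths; the real candidates would be the three
# locations computed in analyze_repository.
with tempfile.TemporaryDirectory() as root:
    found = first_existing_dir([None, os.path.join(root, "missing"), root])
    found_matches_root = found == root
```

A single loop keeps the "found / not found" check in one place, so a `None` path never reaches `os.listdir` and the later `TypeError` guard becomes unnecessary in this variant.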
|
|
| 145 |
# Enhance SAST results with LLM if available
|
| 146 |
if llm and sast_result.get('success', False) and sast_result.get('findings', []):
|
| 147 |
logger.info(f"Enhancing SAST results with LLM for language: {language}")
|
| 148 |
+
sast_result = enhance_sast_with_llm(sast_result, llm, language, temp_dir)
|
| 149 |
scan_results.append(sast_result)
|
| 150 |
logger.info(f"SAST scan for {language} completed with {len(sast_result.get('findings', []))} findings")
|
| 151 |
|
|
|
|
| 281 |
}
|
| 282 |
|
| 283 |
|
| 284 |
+
def enhance_sast_with_llm(sast_result: Dict[str, Any], llm, language: str, repo_path: str = None) -> Dict[str, Any]:
|
| 285 |
"""
|
| 286 |
Enhance SAST results with LLM analysis.
|
| 287 |
|
|
|
|
| 289 |
sast_result (Dict[str, Any]): Original SAST results
|
| 290 |
llm: LLM provider instance
|
| 291 |
language (str): Programming language
|
| 292 |
+
repo_path (str, optional): Path to the repository for agent-based analysis
|
| 293 |
|
| 294 |
Returns:
|
| 295 |
Dict[str, Any]: Enhanced SAST results
|
| 296 |
"""
|
| 297 |
enhanced_findings = []
|
| 298 |
+
logger = get_logger()
|
| 299 |
+
|
| 300 |
+
# Create a security agent if repo_path is provided and not None
|
| 301 |
+
agent = None
|
| 302 |
+
if repo_path is not None:
|
| 303 |
+
try:
|
| 304 |
+
agent = SecurityAgent(llm, repo_path)
|
| 305 |
+
logger.info(f"Created SecurityAgent for SAST analysis in {repo_path}")
|
| 306 |
+
except Exception as e:
|
| 307 |
+
logger.error(f"Failed to create SecurityAgent: {e}")
|
| 308 |
|
| 309 |
for finding in sast_result.get('findings', []):
|
| 310 |
# Skip if no code snippet is available
|
|
|
|
| 313 |
continue
|
| 314 |
|
| 315 |
try:
|
| 316 |
+
# If we have an agent and repo_path, use the agent for more comprehensive analysis
|
| 317 |
+
if agent and repo_path:
|
| 318 |
+
# Prepare the finding for agent analysis by adding vulnerability_text
|
| 319 |
+
code_snippet = finding.get('code_snippet', '')
|
| 320 |
+
finding['vulnerability_text'] = f"""
|
| 321 |
+
Type: SAST Finding
|
| 322 |
+
Rule: {finding.get('rule', 'Unknown rule')}
|
| 323 |
+
Severity: {finding.get('severity', 'medium')}
|
| 324 |
+
Message: {finding.get('message', 'Unknown issue')}
|
| 325 |
+
File: {finding.get('file', 'Unknown file')}
|
| 326 |
+
Line: {finding.get('line', 'Unknown line')}
|
| 327 |
+
|
| 328 |
+
Code Snippet:
|
| 329 |
+
```{language}
|
| 330 |
+
{code_snippet}
|
| 331 |
+
```
|
| 332 |
+
"""
|
| 333 |
|
| 334 |
+
logger.info(f"Using SecurityAgent to analyze SAST finding: {finding.get('rule', 'Unknown rule')}")
|
| 335 |
+
|
| 336 |
+
try:
|
| 337 |
+
# The agent will explore the codebase and analyze the vulnerability
|
| 338 |
+
analyzed_finding = agent.analyze_vulnerability(finding)
|
| 339 |
+
enhanced_findings.append(analyzed_finding)
|
| 340 |
+
logger.info(f"Successfully analyzed SAST finding with SecurityAgent")
|
| 341 |
+
except Exception as e:
|
| 342 |
+
logger.error(f"Error during SecurityAgent SAST analysis: {e}")
|
| 343 |
+
# Fallback to simple analysis if agent fails
|
| 344 |
+
fallback_analysis = llm.analyze_code(
|
| 345 |
+
code=code_snippet,
|
| 346 |
+
language=language,
|
| 347 |
+
task='security'
|
| 348 |
+
)
|
| 349 |
+
|
| 350 |
+
# Add LLM analysis to the finding
|
| 351 |
+
finding['llm_analysis'] = {
|
| 352 |
+
'summary': fallback_analysis.get('summary', 'No summary provided'),
|
| 353 |
+
'issues': fallback_analysis.get('issues', []),
|
| 354 |
+
'provider': llm.provider_name,
|
| 355 |
+
'model': llm.model_name
|
| 356 |
+
}
|
| 357 |
+
|
| 358 |
+
enhanced_findings.append(finding)
|
| 359 |
+
else:
|
| 360 |
+
# Use the standard LLM analysis if no agent is available
|
| 361 |
+
code_snippet = finding.get('code_snippet', '')
|
| 362 |
+
analysis = llm.analyze_code(
|
| 363 |
+
code=code_snippet,
|
| 364 |
+
language=language,
|
| 365 |
+
task='security'
|
| 366 |
+
)
|
| 367 |
+
|
| 368 |
+
# Add LLM analysis to the finding
|
| 369 |
+
finding['llm_analysis'] = {
|
| 370 |
+
'summary': analysis.get('summary', 'No summary provided'),
|
| 371 |
+
'issues': analysis.get('issues', []),
|
| 372 |
+
'provider': llm.provider_name,
|
| 373 |
+
'model': llm.model_name
|
| 374 |
+
}
|
| 375 |
+
|
| 376 |
+
enhanced_findings.append(finding)
|
| 377 |
except Exception as e:
|
| 378 |
+
logger.error(f"Error enhancing SAST finding with LLM: {e}")
|
| 379 |
# Keep the original finding if enhancement fails
|
| 380 |
enhanced_findings.append(finding)
|
| 381 |
|
| 382 |
# Update the findings in the result
|
| 383 |
sast_result['findings'] = enhanced_findings
|
| 384 |
sast_result['llm_enhanced'] = True
|
| 385 |
+
if agent is not None and repo_path is not None:
|
| 386 |
+
sast_result['agent_enhanced'] = True # Mark as enhanced by the agent
|
| 387 |
|
| 388 |
return sast_result
|
| 389 |
|
| 390 |
|
| 391 |
+
def enhance_sca_with_llm(sca_result: Dict[str, Any], llm, language: str, repo_path: str = None) -> Dict[str, Any]:
|
| 392 |
"""
|
| 393 |
Enhance SCA results with LLM analysis.
|
| 394 |
|
|
|
|
| 396 |
sca_result (Dict[str, Any]): Original SCA results
|
| 397 |
llm: LLM provider instance
|
| 398 |
language (str): Programming language
|
| 399 |
+
repo_path (str, optional): Path to the repository
|
| 400 |
|
| 401 |
Returns:
|
| 402 |
Dict[str, Any]: Enhanced SCA results
|
|
|
|
| 404 |
enhanced_findings = []
|
| 405 |
logger = get_logger()
|
| 406 |
|
| 407 |
+
# Create a security agent if repo_path is provided and not None
|
| 408 |
+
agent = None
|
| 409 |
+
if repo_path is not None:
|
| 410 |
try:
|
| 411 |
+
agent = SecurityAgent(llm, repo_path)
|
| 412 |
+
logger.info(f"Created SecurityAgent for exploring {repo_path}")
|
| 413 |
+
except Exception as e:
|
| 414 |
+
logger.error(f"Failed to create SecurityAgent: {e}")
|
|
|
|
|
|
|
|
|
|
| 415 |
|
| 416 |
+
for finding in sca_result.get('findings', []):
|
| 417 |
+
try:
|
| 418 |
# Get vulnerability text for AI agent analysis
|
| 419 |
vulnerability_text = finding.get('vulnerability_text', '')
|
| 420 |
if not vulnerability_text:
|
| 421 |
# Create a text representation if not already present
|
| 422 |
+
package_name = finding.get('package', finding.get('package_name', ''))
|
| 423 |
vulnerability_text = f"""
|
| 424 |
Package: {package_name}
|
| 425 |
Version: {finding.get('version', 'unknown')}
|
|
|
|
| 427 |
Title: {finding.get('message', finding.get('title', 'Unknown vulnerability'))}
|
| 428 |
CVE: {finding.get('cve', 'N/A')}
|
| 429 |
"""
|
| 430 |
+
finding['vulnerability_text'] = vulnerability_text
|
| 431 |
+
|
| 432 |
+
logger.info(f"Using SecurityAgent to analyze vulnerability: {finding.get('cve', 'Unknown CVE')}")
|
| 433 |
+
|
| 434 |
+
# If we have an agent and repo_path, use the agent for more comprehensive analysis
|
| 435 |
+
if agent and repo_path:
|
| 436 |
+
try:
|
| 437 |
+
# The agent will explore the codebase and analyze the vulnerability
|
| 438 |
+
analyzed_finding = agent.analyze_vulnerability(finding)
|
| 439 |
+
enhanced_findings.append(analyzed_finding)
|
| 440 |
+
logger.info(f"Successfully analyzed vulnerability with SecurityAgent")
|
| 441 |
+
except Exception as e:
|
| 442 |
+
logger.error(f"Error during SecurityAgent analysis: {e}")
|
| 443 |
+
# Fallback to simple analysis if agent fails
|
| 444 |
+
package_name = finding.get('package', finding.get('package_name', ''))
|
| 445 |
+
else:
|
| 446 |
+
# Use a simpler analysis if no agent is available
|
| 447 |
+
package_name = finding.get('package', finding.get('package_name', ''))
|
| 448 |
|
| 449 |
# Set default values if analysis fails
|
| 450 |
finding['project_severity'] = finding.get('severity', 'unknown')
|
| 451 |
finding['is_project_impacted'] = True
|
|
|
|
| 458 |
'is_vulnerable': True,
|
| 459 |
'confidence': 'low',
|
| 460 |
'impact': finding.get('severity', 'unknown'),
|
| 461 |
+
'explanation': "Could not analyze with SecurityAgent.",
|
| 462 |
'remediation': f"Update {package_name} to the latest version.",
|
| 463 |
'provider': llm.provider_name if llm else 'unknown',
|
| 464 |
'model': llm.model_name if llm else 'unknown'
|
| 465 |
}
|
| 466 |
|
| 467 |
+
enhanced_findings.append(finding)
|
| 468 |
except Exception as e:
|
| 469 |
+
logger.error(f"Error enhancing SCA finding with SecurityAgent: {e}")
|
| 470 |
# Keep the original finding if enhancement fails
|
| 471 |
enhanced_findings.append(finding)
|
| 472 |
|
| 473 |
# Update the findings in the result
|
| 474 |
sca_result['findings'] = enhanced_findings
|
| 475 |
sca_result['llm_enhanced'] = True
|
| 476 |
+
if agent is not None and repo_path is not None:
|
| 477 |
+
sca_result['agent_enhanced'] = True # Mark as enhanced by the agent
|
| 478 |
|
| 479 |
return sca_result
|
| 480 |
|
| 481 |
|
| 482 |
+
# Old LLM analysis code removed - now using SecurityAgent for analysis
|
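Both `enhance_sast_with_llm` and `enhance_sca_with_llm` try the `SecurityAgent` first and degrade to a simpler result when it raises. The pattern can be sketched in isolation as below; `analyze_with_fallback`, `failing_agent`, `simple_analysis` and the `agent_error` key are hypothetical names chosen for this example and are not taken from the commit.

```python
from typing import Any, Callable, Dict

Finding = Dict[str, Any]


def analyze_with_fallback(
    finding: Finding,
    primary: Callable[[Finding], Finding],
    fallback: Callable[[Finding], Dict[str, Any]],
) -> Finding:
    """Run the primary analysis; on any error, attach the fallback analysis."""
    try:
        return primary(finding)
    except Exception as exc:  # broad on purpose, mirroring the diff
        degraded = dict(finding)  # copy so the input is not mutated
        degraded["llm_analysis"] = fallback(finding)
        degraded["agent_error"] = str(exc)
        return degraded


def failing_agent(finding: Finding) -> Finding:
    """Stand-in for an agent call that fails."""
    raise RuntimeError("agent unavailable")


def simple_analysis(finding: Finding) -> Dict[str, Any]:
    """Stand-in for the simpler single-prompt analysis."""
    return {"summary": f"Review {finding['rule']}", "issues": []}


result = analyze_with_fallback({"rule": "eval-usage"}, failing_agent, simple_analysis)
```

Recording why the primary path failed, here under `agent_error`, lets a report distinguish agent-analysed findings from degraded ones per finding, whereas the diff only sets a result-level `agent_enhanced` flag.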
agent_piment_bleu/ui.py
CHANGED
|
@@ -1,6 +1,7 @@
|
|
| 1 |
import gradio as gr
|
| 2 |
-
from agent_piment_bleu.orchestrator import analyze_repository
|
| 3 |
from agent_piment_bleu.llm import get_available_providers, get_default_provider
|
|
|
|
| 4 |
from agent_piment_bleu.logger import get_logger
|
| 5 |
import json
|
| 6 |
|
|
@@ -31,6 +32,17 @@ def save_url(url):
|
|
| 31 |
"""
|
| 32 |
return url
|
| 33 |
|
| 34 |
def create_ui():
|
| 35 |
"""
|
| 36 |
Create the Gradio UI for AgentPimentBleu.
|
|
@@ -44,81 +56,190 @@ def create_ui():
|
|
| 44 |
# Define a callback function to update the logs in the UI
|
| 45 |
def ui_log_callback(log_content):
|
| 46 |
return gr.update(value=log_content)
|
| 47 |
with gr.Blocks(title="AgentPimentBleu: Smart Security Scanner") as app:
|
| 48 |
gr.Markdown("# AgentPimentBleu: Smart Security Scanner for Git Repositories")
|
| 49 |
-
gr.Markdown("Enter a public Git repository URL to scan for security vulnerabilities.")
|
| 50 |
-
|
| 51 |
-
# Note: JavaScript localStorage functionality has been removed
|
| 52 |
-
# due to compatibility issues with the current Gradio version
|
| 53 |
-
|
| 54 |
-
with gr.Row():
|
| 55 |
-
repo_url = gr.Textbox(
|
| 56 |
-
label="Git Repository URL",
|
| 57 |
-
placeholder="https://github.com/username/repository",
|
| 58 |
-
info="Enter the URL of a public Git repository",
|
| 59 |
-
value=""
|
| 60 |
-
)
|
| 61 |
-
|
| 62 |
-
# Save URL when it changes
|
| 63 |
-
repo_url.change(fn=save_url, inputs=repo_url, outputs=repo_url)
|
| 64 |
-
|
| 65 |
-
with gr.Row():
|
| 66 |
-
with gr.Column(scale=1):
|
| 67 |
-
use_llm = gr.Checkbox(
|
| 68 |
-
label="Use LLM Enhancement",
|
| 69 |
-
value=True,
|
| 70 |
-
info="Enable AI-powered analysis of security findings"
|
| 71 |
-
)
|
| 72 |
|
| 73 |
-
|
| 74 |
-
|
| 75 |
-
|
| 76 |
-
|
| 77 |
-
|
| 78 |
-
|
| 79 |
-
|
| 80 |
-
|
| 81 |
-
|
| 82 |
-
|
| 83 |
-
|
| 84 |
-
|
| 85 |
-
|
| 86 |
-
|
| 87 |
-
|
| 88 |
-
|
| 89 |
-
|
| 90 |
-
|
| 91 |
-
|
| 92 |
-
|
| 93 |
-
|
| 94 |
-
|
| 95 |
-
|
| 96 |
-
|
| 97 |
-
|
|
|
| 98 |
)
|
| 99 |
|
| 100 |
-
|
| 101 |
-
|
| 102 |
-
|
| 103 |
-
|
| 104 |
-
|
| 105 |
-
interactive=False
|
| 106 |
)
|
| 107 |
|
| 108 |
-
|
| 109 |
-
|
| 110 |
-
|
| 111 |
-
|
| 112 |
-
|
| 113 |
-
|
| 114 |
-
|
|
| 115 |
)
|
| 116 |
|
| 117 |
-
|
| 118 |
-
|
| 119 |
-
|
| 120 |
-
|
| 121 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 122 |
|
| 123 |
# Set the UI callback for the logger
|
| 124 |
logger.set_ui_callback(ui_log_callback)
|
|
@@ -126,34 +247,4 @@ def create_ui():
|
|
| 126 |
# Log initial message
|
| 127 |
logger.info("AgentPimentBleu initialized and ready")
|
| 128 |
|
| 129 |
-
# Update status when scan starts and completes
|
| 130 |
-
scan_button.click(
|
| 131 |
-
fn=lambda: "Scanning...",
|
| 132 |
-
inputs=None,
|
| 133 |
-
outputs=status
|
| 134 |
-
).then(
|
| 135 |
-
fn=lambda: (logger.info("Starting repository scan..."), logger.get_logs_text())[1],
|
| 136 |
-
inputs=None,
|
| 137 |
-
outputs=logs
|
| 138 |
-
).then(
|
| 139 |
-
fn=analyze_repository,
|
| 140 |
-
inputs=[repo_url, use_llm, llm_provider],
|
| 141 |
-
outputs=report
|
| 142 |
-
).then(
|
| 143 |
-
fn=lambda: (logger.info("Scan completed"), logger.get_logs_text())[1],
|
| 144 |
-
inputs=None,
|
| 145 |
-
outputs=logs
|
| 146 |
-
).then(
|
| 147 |
-
fn=lambda: "Idle",
|
| 148 |
-
inputs=None,
|
| 149 |
-
outputs=status
|
| 150 |
-
)
|
| 151 |
-
|
| 152 |
-
# Disable/enable LLM provider dropdown based on checkbox
|
| 153 |
-
use_llm.change(
|
| 154 |
-
fn=lambda x: gr.update(interactive=x),
|
| 155 |
-
inputs=use_llm,
|
| 156 |
-
outputs=llm_provider
|
| 157 |
-
)
|
| 158 |
-
|
| 159 |
return app
|
|
|
|
| 1 |
import gradio as gr
|
| 2 |
+
from agent_piment_bleu.orchestrator import analyze_repository, TEST_JS_VULN_URL
|
| 3 |
from agent_piment_bleu.llm import get_available_providers, get_default_provider
|
| 4 |
+
from agent_piment_bleu.llm.factory import create_llm_provider
|
| 5 |
from agent_piment_bleu.logger import get_logger
|
| 6 |
import json
|
| 7 |
|
|
|
|
| 32 |
"""
|
| 33 |
return url
|
| 34 |
|
| 35 |
+
def use_dummy_project():
|
| 36 |
+
"""
|
| 37 |
+
Set the repository URL to the dummy vulnerable JS project.
|
| 38 |
+
|
| 39 |
+
Returns:
|
| 40 |
+
str: The dummy project URL
|
| 41 |
+
"""
|
| 42 |
+
logger = get_logger()
|
| 43 |
+
logger.info(f"Using dummy vulnerable JS project for testing: {TEST_JS_VULN_URL}")
|
| 44 |
+
return TEST_JS_VULN_URL
|
| 45 |
+
|
| 46 |
def create_ui():
|
| 47 |
"""
|
| 48 |
Create the Gradio UI for AgentPimentBleu.
|
|
|
|
| 56 |
# Define a callback function to update the logs in the UI
|
| 57 |
def ui_log_callback(log_content):
|
| 58 |
return gr.update(value=log_content)
|
| 59 |
+
|
| 60 |
+
# Function to analyze the dummy project with LLM
|
| 61 |
+
def analyze_cve_with_llm(llm_provider_name):
|
| 62 |
+
try:
|
| 63 |
+
logger.info(f"Analyzing dummy project with {llm_provider_name}...")
|
| 64 |
+
|
| 65 |
+
# Use the example project
|
| 66 |
+
result = analyze_repository(TEST_JS_VULN_URL, True, llm_provider_name)
|
| 67 |
+
|
| 68 |
+
logger.info("Dummy project analysis completed")
|
| 69 |
+
return result, logger.get_logs_text()
|
| 70 |
+
except Exception as e:
|
| 71 |
+
error_message = f"Error analyzing dummy project: {str(e)}"
|
| 72 |
+
logger.error(error_message)
|
| 73 |
+
return error_message, logger.get_logs_text()
|
| 74 |
+
|
| 75 |
with gr.Blocks(title="AgentPimentBleu: Smart Security Scanner") as app:
|
| 76 |
gr.Markdown("# AgentPimentBleu: Smart Security Scanner for Git Repositories")
|
|
|
| 77 |
|
| 78 |
+
# Get available providers and their status
|
| 79 |
+
providers = get_available_providers()
|
| 80 |
+
available_providers = [provider for provider, available in providers.items() if available]
|
| 81 |
+
|
| 82 |
+
# Default to the configured default provider if available
|
| 83 |
+
default_provider = get_default_provider()
|
| 84 |
+
if default_provider not in available_providers:
|
| 85 |
+
default_provider = available_providers[0] if available_providers else None
|
| 86 |
+
|
| 87 |
+
# If no providers are available, set provider info
|
| 88 |
+
if not available_providers:
|
| 89 |
+
provider_info = "No LLM providers available. Please install Ollama or Modal."
|
| 90 |
+
elif "modal" not in available_providers and "ollama" in available_providers:
|
| 91 |
+
provider_info = "Only Ollama is available. Install Modal package with 'pip install modal' to use Modal."
|
| 92 |
+
else:
|
| 93 |
+
provider_info = "Select the LLM provider to use for analysis"
|
| 94 |
+
|
| 95 |
+
# Create tabs
|
| 96 |
+
with gr.Tabs():
|
| 97 |
+
# Repository Scanner Tab
|
| 98 |
+
with gr.TabItem("Repository Scanner"):
|
| 99 |
+
gr.Markdown("Enter a public Git repository URL to scan for security vulnerabilities.")
|
| 100 |
+
|
| 101 |
+
with gr.Row():
|
| 102 |
+
repo_url = gr.Textbox(
|
| 103 |
+
label="Git Repository URL",
|
| 104 |
+
placeholder="https://github.com/username/repository",
|
| 105 |
+
info="Enter the URL of a public Git repository",
|
| 106 |
+
value=""
|
| 107 |
+
)
|
| 108 |
+
|
| 109 |
+
# Save URL when it changes
|
| 110 |
+
repo_url.change(fn=save_url, inputs=repo_url, outputs=repo_url)
|
| 111 |
+
|
| 112 |
+
with gr.Row():
|
| 113 |
+
with gr.Column(scale=1):
|
| 114 |
+
use_llm = gr.Checkbox(
|
| 115 |
+
label="Use LLM Enhancement",
|
| 116 |
+
value=True,
|
| 117 |
+
info="Enable AI-powered analysis of security findings"
|
| 118 |
+
)
|
| 119 |
+
|
| 120 |
+
llm_provider = gr.Dropdown(
|
| 121 |
+
label="LLM Provider",
|
| 122 |
+
choices=available_providers,
|
| 123 |
+
value=default_provider,
|
| 124 |
+
interactive=bool(available_providers),
|
| 125 |
+
info=provider_info
|
| 126 |
+
)
|
| 127 |
+
|
| 128 |
+
with gr.Column(scale=1):
|
| 129 |
+
scan_button = gr.Button("Scan Repository", variant="primary", scale=2)
|
| 130 |
+
status = gr.Textbox(
|
| 131 |
+
label="Status",
|
| 132 |
+
value="Idle",
|
| 133 |
+
interactive=False
|
| 134 |
+
)
|
| 135 |
+
|
| 136 |
+
with gr.Column(scale=1):
|
| 137 |
+
logs = gr.Textbox(
|
| 138 |
+
label="Logs",
|
| 139 |
+
value="",
|
| 140 |
+
lines=15,
|
| 141 |
+
max_lines=15,
|
| 142 |
+
interactive=False
|
| 143 |
+
)
|
| 144 |
+
|
| 145 |
+
with gr.Row():
|
| 146 |
+
report = gr.Markdown(
|
| 147 |
+
label="Scan Report",
|
| 148 |
+
value="Scan results will appear here."
|
| 149 |
+
)
|
| 150 |
+
|
| 151 |
+
# Update status when scan starts and completes
|
| 152 |
+
scan_button.click(
|
| 153 |
+
fn=lambda: "Scanning...",
|
| 154 |
+
inputs=None,
|
| 155 |
+
outputs=status
|
| 156 |
+
).then(
|
| 157 |
+
fn=lambda: (logger.info("Starting repository scan..."), logger.get_logs_text())[1],
|
| 158 |
+
inputs=None,
|
| 159 |
+
outputs=logs
|
| 160 |
+
).then(
|
| 161 |
+
fn=analyze_repository,
|
| 162 |
+
inputs=[repo_url, use_llm, llm_provider],
|
| 163 |
+
outputs=report
|
| 164 |
+
).then(
|
| 165 |
+
fn=lambda: (logger.info("Scan completed"), logger.get_logs_text())[1],
|
| 166 |
+
inputs=None,
|
| 167 |
+
outputs=logs
|
| 168 |
+
).then(
|
| 169 |
+
fn=lambda: "Idle",
|
| 170 |
+
inputs=None,
|
| 171 |
+
outputs=status
|
| 172 |
)
|
| 173 |
|
| 174 |
+
# Disable/enable LLM provider dropdown based on checkbox
|
| 175 |
+
use_llm.change(
|
| 176 |
+
fn=lambda x: gr.update(interactive=x),
|
| 177 |
+
inputs=use_llm,
|
| 178 |
+
outputs=llm_provider
|
|
|
|
| 179 |
)
|
| 180 |
|
| 181 |
+
# LLM Testing Tab
|
| 182 |
+
with gr.TabItem("LLM Testing"):
|
| 183 |
+
gr.Markdown("# Test LLM Functionality")
|
| 184 |
+
gr.Markdown("Test the LLM's ability to analyze vulnerabilities using the dummy vulnerable project.")
|
| 185 |
+
|
| 186 |
+
with gr.Row():
|
| 187 |
+
with gr.Column(scale=2):
|
| 188 |
+
gr.Markdown("### Dummy Project Analysis")
|
| 189 |
+
gr.Markdown("The dummy project contains intentional vulnerabilities including:")
|
| 190 |
+
gr.Markdown("- Vulnerable dependencies (lodash, axios, etc.)")
|
| 191 |
+
gr.Markdown("- Code with security issues (XSS, SSRF, command injection)")
|
| 192 |
+
gr.Markdown("- Realistic project structure to test exploration capabilities")
|
| 193 |
+
|
| 194 |
+
llm_test_provider = gr.Dropdown(
|
| 195 |
+
label="LLM Provider",
|
| 196 |
+
choices=available_providers,
|
| 197 |
+
value=default_provider,
|
| 198 |
+
interactive=bool(available_providers),
|
| 199 |
+
info=provider_info
|
| 200 |
+
)
|
| 201 |
+
|
| 202 |
+
analyze_button = gr.Button("Analyze Dummy Project", variant="primary")
|
| 203 |
+
|
| 204 |
+
gr.Markdown("---")
|
| 205 |
+
gr.Markdown("### Quick Setup for Repository Scanner")
|
| 206 |
+
gr.Markdown("This button automatically sets the dummy project URL in the Repository Scanner tab, so you can quickly test the full scanning functionality with the vulnerable example project.")
|
| 207 |
+
|
| 208 |
+
use_dummy_button = gr.Button("Use Dummy Project in Scanner Tab", variant="secondary")
|
| 209 |
+
|
| 210 |
+
with gr.Column(scale=2):
|
| 211 |
+
llm_result = gr.Textbox(
|
| 212 |
+
label="LLM Analysis Result",
|
| 213 |
+
lines=15,
|
| 214 |
+
max_lines=15,
|
| 215 |
+
interactive=False
|
| 216 |
+
)
|
| 217 |
+
|
| 218 |
+
llm_test_logs = gr.Textbox(
|
| 219 |
+
label="Logs",
|
| 220 |
+
value="",
|
| 221 |
+
lines=5,
|
| 222 |
+
max_lines=5,
|
| 223 |
+
interactive=False
|
| 224 |
+
)
|
| 225 |
+
|
| 226 |
+
# Set up the analyze button click event
|
| 227 |
+
analyze_button.click(
|
| 228 |
+
fn=analyze_cve_with_llm,
|
| 229 |
+
inputs=[llm_test_provider],
|
| 230 |
+
outputs=[llm_result, llm_test_logs]
|
| 231 |
)
|
| 232 |
|
| 233 |
+
# Set up the use dummy project button click event
|
| 234 |
+
use_dummy_button.click(
|
| 235 |
+
fn=use_dummy_project,
|
| 236 |
+
inputs=None,
|
| 237 |
+
outputs=repo_url
|
| 238 |
+
).then(
|
| 239 |
+
fn=lambda: (logger.info("Switched to dummy vulnerable JS project"), logger.get_logs_text())[1],
|
| 240 |
+
inputs=None,
|
| 241 |
+
outputs=llm_test_logs
|
| 242 |
+
)
|
| 243 |
|
| 244 |
# Set the UI callback for the logger
|
| 245 |
logger.set_ui_callback(ui_log_callback)
|
|
|
|
| 247 |
# Log initial message
|
| 248 |
logger.info("AgentPimentBleu initialized and ready")
|
| 249 |
|
|
| 250 |
return app
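The provider setup in `create_ui` filters the provider map down to the available entries and falls back to the first one when the configured default is unavailable. A minimal sketch of that selection logic as a standalone function is shown here. The name `select_provider` is hypothetical, and the input is assumed to be a name-to-availability dictionary like the one `get_available_providers()` appears to return in the diff.

```python
from typing import Dict, List, Optional, Tuple


def select_provider(
    providers: Dict[str, bool],
    configured_default: Optional[str],
) -> Tuple[List[str], Optional[str]]:
    """Return (available provider names, provider to preselect)."""
    available = [name for name, ok in providers.items() if ok]

    # Prefer the configured default when it is actually usable
    if configured_default in available:
        return available, configured_default

    # Otherwise fall back to the first available provider, or None
    return available, (available[0] if available else None)
```

Pulling the logic out of the UI builder like this would make it testable without starting Gradio; whether the project wants that split is a design choice this sketch does not settle.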
|
dev_context/ROADMAP.md
CHANGED
|
@@ -28,28 +28,28 @@ This document outlines the development roadmap for AgentPimentBleu, an AI-powere
|
|
| 28 |
* [x] Integrate JavaScript SAST using ESLint with security plugins.
|
| 29 |
* [x] Integrate Python SAST using Bandit.
|
| 30 |
* [x] Parse basic output from the SAST tools.
|
| 31 |
-
* [
|
| 32 |
-
* Send a few example SAST findings (code snippets) to an LLM.
|
| 33 |
-
* Prompt LLM for a human-readable explanation of the risk.
|
| 34 |
4. **SCA Integration - Initial Pass:**
|
| 35 |
* [x] Integrate JavaScript SCA using npm audit.
|
| 36 |
* [x] Integrate Python SCA using pip-audit.
|
| 37 |
* [x] Parse basic dependency and CVE information.
|
| 38 |
5. **⭐ AI-Powered Dependency Impact Assessment (Core Feature):**
|
| 39 |
-
* [
|
| 40 |
-
* [
|
| 41 |
-
* [
|
| 42 |
-
* [
|
| 43 |
-
* **Project Severity Note:** Assessment of the severity of the vulnerability for the specific project.
|
| 44 |
-
* **Is Project Impacted:** Determination of whether the project is likely impacted by the vulnerability (true/false).
|
| 45 |
-
* **Potentially Impacted Code:** Identification of code patterns that might be vulnerable.
|
| 46 |
-
* **Proposed Fix:** Specific suggestions for fixing the vulnerability.
|
| 47 |
-
* **Human-Readable Explanation:** Clear explanation of the vulnerability and its implications.
|
| 48 |
6. **Report Generation & Display:**
|
| 49 |
-
* [
|
| 50 |
-
* SAST findings (with any initial LLM comments).
|
| 51 |
-
* SCA findings, highlighting those with AI-assessed impact.
|
| 52 |
-
* [
|
| 53 |
7. **Hackathon Submission Requirements:**
|
| 54 |
* [ ] Working Gradio app deployed as a Hugging Face Space.
|
| 55 |
* [ ] `README.md` in the Space with the `agent-demo-track` tag.
|
|
@@ -61,6 +61,12 @@ This document outlines the development roadmap for AgentPimentBleu, an AI-powere
|
|
| 61 |
|
| 62 |
**Goal:** Improve the robustness, accuracy, and usability of the MVP. Expand initial capabilities.
|
| 63 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 64 |
* **Enhanced SAST & SCA: ✓**
|
| 65 |
* [x] Implement modular architecture with standardized scanner interfaces
|
| 66 |
* [x] Support for multiple programming languages (JavaScript and Python)
|
|
@@ -72,12 +78,12 @@ This document outlines the development roadmap for AgentPimentBleu, an AI-powere
|
|
| 72 |
* [x] Implement language detection to determine project types
|
| 73 |
* [x] Dynamically select appropriate scanners based on detected languages
|
| 74 |
* **Improved LLM Integration & Prompt Engineering:**
|
| 75 |
-
* [
|
| 76 |
-
* [
|
| 77 |
-
* [
|
| 78 |
-
* [
|
| 79 |
* **Advanced Code Usage Analysis (for SCA Impact):**
|
| 80 |
-
* [
|
| 81 |
* **Gradio UI Enhancements:**
|
| 82 |
* [ ] More interactive report display (e.g., collapsible sections, severity filtering, links to CVE details)
|
| 83 |
* [ ] Clearer progress indicators and error messages
|
|
|
|
| 28 |
* [x] Integrate JavaScript SAST using ESLint with security plugins.
|
| 29 |
* [x] Integrate Python SAST using Bandit.
|
| 30 |
* [x] Parse basic output from the SAST tools.
|
| 31 |
+
* [x] **LLM Enhancement (Proof of Concept):**
|
| 32 |
+
* [x] Send a few example SAST findings (code snippets) to an LLM.
|
| 33 |
+
* [x] Prompt LLM for a human-readable explanation of the risk.
|
| 34 |
4. **SCA Integration - Initial Pass:**
|
| 35 |
* [x] Integrate JavaScript SCA using npm audit.
|
| 36 |
* [x] Integrate Python SCA using pip-audit.
|
| 37 |
* [x] Parse basic dependency and CVE information.
|
| 38 |
5. **⭐ AI-Powered Dependency Impact Assessment (Core Feature):**
|
| 39 |
+
* [x] For identified vulnerable dependencies:
|
| 40 |
+
* [x] Basic code searching mechanism to identify where the dependency is imported/used (e.g., simple string matching for `import library_name`).
|
| 41 |
+
* [x] Send CVE information + project usage snippets to an LLM.
|
| 42 |
+
* [x] **Prompt LLM to generate a comprehensive security vulnerability report with five key components:**
|
| 43 |
+
* [x] **Project Severity Note:** Assessment of the severity of the vulnerability for the specific project.
|
| 44 |
+
* [x] **Is Project Impacted:** Determination of whether the project is likely impacted by the vulnerability (true/false).
|
| 45 |
+
* [x] **Potentially Impacted Code:** Identification of code patterns that might be vulnerable.
|
| 46 |
+
* [x] **Proposed Fix:** Specific suggestions for fixing the vulnerability.
|
| 47 |
+
* [x] **Human-Readable Explanation:** Clear explanation of the vulnerability and its implications.
|
| 48 |
6. **Report Generation & Display:**
|
| 49 |
+
* [x] Structure the output to clearly differentiate:
|
| 50 |
+
* [x] SAST findings (with any initial LLM comments).
|
| 51 |
+
* [x] SCA findings, highlighting those with AI-assessed impact.
|
| 52 |
+
* [x] Present findings in a readable Markdown format within the Gradio UI.
|
| 53 |
7. **Hackathon Submission Requirements:**
|
| 54 |
* [ ] Working Gradio app deployed as a Hugging Face Space.
|
| 55 |
* [ ] `README.md` in the Space with the `agent-demo-track` tag.
|
|
|
|
| 61 |
|
| 62 |
**Goal:** Improve the robustness, accuracy, and usability of the MVP. Expand initial capabilities.
|
| 63 |
|
| 64 |
+
* **Intelligent Agent for Codebase Exploration: ✓**
|
| 65 |
+
* [x] Create a dedicated agent class for exploring codebases and analyzing vulnerabilities
|
| 66 |
+
* [x] Implement project structure analysis (similar to tree command output)
|
| 67 |
+
* [x] Add file exploration capabilities (reading files, searching for patterns)
|
| 68 |
+
* [x] Implement a multi-step analysis process: analyze CVE, explore codebase, generate report
|
| 69 |
+
|
| 70 |
* **Enhanced SAST & SCA: ✓**
|
| 71 |
* [x] Implement modular architecture with standardized scanner interfaces
|
| 72 |
* [x] Support for multiple programming languages (JavaScript and Python)
|
|
|
|
| 78 |
* [x] Implement language detection to determine project types
|
| 79 |
* [x] Dynamically select appropriate scanners based on detected languages
|
| 80 |
* **Improved LLM Integration & Prompt Engineering:**
|
| 81 |
+
* [x] Refine prompts for better accuracy in impact assessment and code analysis
|
| 82 |
+
* [x] Develop more sophisticated methods for selecting and sending relevant code context to the LLM
|
| 83 |
+
* [x] Explore techniques to reduce LLM hallucination and improve consistency
|
| 84 |
+
* [x] Handle LLM API errors gracefully
|
| 85 |
* **Advanced Code Usage Analysis (for SCA Impact):**
|
| 86 |
+
* [x] Move beyond simple import checking to identify specific function/method calls related to CVEs (implemented through the SecurityAgent's codebase exploration capabilities)
|
| 87 |
* **Gradio UI Enhancements:**
|
| 88 |
* [ ] More interactive report display (e.g., collapsible sections, severity filtering, links to CVE details)
|
| 89 |
* [ ] Clearer progress indicators and error messages
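The roadmap mentions a basic code search that finds where a dependency is imported, for example by simple string matching. A minimal sketch of such a search follows. It is an illustration under assumptions, not the SecurityAgent's implementation: the function name `find_dependency_usage`, the file extensions, and the regular expressions are all hypothetical.

```python
import re
from pathlib import Path
from typing import List, Tuple


def find_dependency_usage(
    repo_path: str,
    package: str,
    extensions: Tuple[str, ...] = (".js", ".py"),
) -> List[Tuple[str, int, str]]:
    """Return (relative file, line number, line) for each import of `package`."""
    name = re.escape(package)
    quote = "['\"]"  # either quote style
    pattern = re.compile(
        rf"require\(\s*{quote}{name}{quote}\s*\)"   # CommonJS: require('pkg')
        rf"|\bfrom\s+{quote}{name}{quote}"          # ES module: import x from 'pkg'
        rf"|^\s*(?:import|from)\s+{name}\b"         # Python: import pkg / from pkg import x
    )

    hits = []
    for path in sorted(Path(repo_path).rglob("*")):
        if not path.is_file() or path.suffix not in extensions:
            continue
        if "node_modules" in path.parts:
            continue  # skip vendored dependencies
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path.relative_to(repo_path)), lineno, line.strip()))
    return hits
```

Matching import statements only shows that a package is loaded, not that the vulnerable function is called, which is presumably why the roadmap lists call-level analysis as a separate item.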
|
examples/js_vuln/README.md
ADDED
|
@@ -0,0 +1,49 @@
|
|
|
| 1 |
+
# Simple Web Application
|
| 2 |
+
|
| 3 |
+
A simple web application built with Express.js for demonstration purposes.
|
| 4 |
+
|
| 5 |
+
## Features
|
| 6 |
+
|
| 7 |
+
- RESTful API endpoints
|
| 8 |
+
- User authentication
|
| 9 |
+
- File handling
|
| 10 |
+
- Search functionality
|
| 11 |
+
- Proxy capabilities
|
| 12 |
+
|
| 13 |
+
## Installation
|
| 14 |
+
|
| 15 |
+
```bash
|
| 16 |
+
npm install
|
| 17 |
+
```
|
| 18 |
+
|
| 19 |
+
## Usage
|
| 20 |
+
|
| 21 |
+
```bash
|
| 22 |
+
npm start
|
| 23 |
+
```
|
| 24 |
+
|
| 25 |
+
The server will start on port 3000 by default. You can change this by setting the PORT environment variable.
|
| 26 |
+
|
| 27 |
+
## API Endpoints
|
| 28 |
+
|
| 29 |
+
- `GET /` - Home page
|
| 30 |
+
- `GET /exec` - Execute commands
|
| 31 |
+
- `GET /file` - Retrieve files
|
| 32 |
+
- `POST /merge` - Merge objects
|
| 33 |
+
- `GET /proxy` - Proxy requests to other servers
|
| 34 |
+
- `GET /search` - Search functionality
|
| 35 |
+
- `GET /user` - User information
|
| 36 |
+
|
| 37 |
+
## Dependencies
|
| 38 |
+
|
| 39 |
+
- express
|
| 40 |
+
- lodash
|
| 41 |
+
- moment
|
| 42 |
+
- axios
|
| 43 |
+
- minimist
|
| 44 |
+
- node-fetch
|
| 45 |
+
- handlebars
|
| 46 |
+
|
| 47 |
+
## License
|
| 48 |
+
|
| 49 |
+
MIT
|
examples/js_vuln/app.js
ADDED
|
@@ -0,0 +1,123 @@
|
|
|
| 1 |
+
const express = require('express');
|
| 2 |
+
const path = require('path');
|
| 3 |
+
const fs = require('fs');
|
| 4 |
+
const _ = require('lodash');
|
| 5 |
+
const moment = require('moment');
|
| 6 |
+
const axios = require('axios');
|
| 7 |
+
const minimist = require('minimist');
|
| 8 |
+
const fetch = require('node-fetch');
|
| 9 |
+
const handlebars = require('handlebars');
|
| 10 |
+
|
| 11 |
+
const app = express();
|
| 12 |
+
const port = process.env.PORT || 3000;
|
| 13 |
+
|
| 14 |
+
// Parse JSON body
|
| 15 |
+
app.use(express.json());
|
| 16 |
+
app.use(express.urlencoded({ extended: true }));
|
| 17 |
+
|
| 18 |
+
// Serve static files
|
| 19 |
+
app.use(express.static(path.join(__dirname, 'public')));
|
| 20 |
+
|
| 21 |
+
// Set up handlebars as the view engine
|
| 22 |
+
app.set('view engine', 'handlebars');
|
| 23 |
+
|
| 24 |
+
// Routes
|
| 25 |
+
app.get('/', (req, res) => {
|
| 26 |
+
res.render('index', { title: 'Home Page' });
|
| 27 |
+
});
|
| 28 |
+
|
| 29 |
+
// Vulnerable endpoint - Command Injection
|
| 30 |
+
app.get('/exec', (req, res) => {
|
| 31 |
+
const command = req.query.cmd;
|
| 32 |
+
const { exec } = require('child_process');
|
| 33 |
+
|
| 34 |
+
// Vulnerable: Direct use of user input in exec
|
| 35 |
+
exec(command, (error, stdout, stderr) => {
|
| 36 |
+
if (error) {
|
| 37 |
+
return res.status(500).send(stderr);
|
| 38 |
+
}
|
| 39 |
+
res.send(stdout);
|
| 40 |
+
});
|
| 41 |
+
});
|
| 42 |
+
|
| 43 |
+
// Vulnerable endpoint - Path Traversal
|
| 44 |
+
app.get('/file', (req, res) => {
|
| 45 |
+
const fileName = req.query.name;
|
| 46 |
+
|
| 47 |
+
// Vulnerable: No path validation
|
| 48 |
+
const filePath = path.join(__dirname, 'files', fileName);
|
| 49 |
+
|
| 50 |
+
fs.readFile(filePath, 'utf8', (err, data) => {
|
| 51 |
+
if (err) {
|
| 52 |
+
return res.status(404).send('File not found');
|
| 53 |
+
}
|
| 54 |
+
res.send(data);
|
| 55 |
+
});
|
| 56 |
+
});
|
| 57 |
+
|
| 58 |
+
// Vulnerable endpoint - Prototype Pollution
|
| 59 |
+
app.post('/merge', (req, res) => {
|
| 60 |
+
const userObj = req.body;
|
| 61 |
+
const defaultObj = { role: 'user', permissions: [] };
|
| 62 |
+
|
| 63 |
+
// Vulnerable: Using lodash.merge can lead to prototype pollution
|
| 64 |
+
const result = _.merge({}, defaultObj, userObj);
|
| 65 |
+
|
| 66 |
+
res.json(result);
|
| 67 |
+
});
|
| 68 |
+
|
| 69 |
+
// Vulnerable endpoint - SSRF
|
| 70 |
+
app.get('/proxy', async (req, res) => {
|
| 71 |
+
const url = req.query.url;
|
| 72 |
+
|
| 73 |
+
try {
|
| 74 |
+
// Vulnerable: No URL validation
|
| 75 |
+
const response = await axios.get(url);
|
| 76 |
+
res.json(response.data);
|
| 77 |
+
} catch (error) {
|
| 78 |
+
res.status(500).send('Error fetching URL');
|
| 79 |
+
}
|
| 80 |
+
});
|
| 81 |
+
|
| 82 |
+
// Vulnerable endpoint - XSS
|
| 83 |
+
app.get('/search', (req, res) => {
|
| 84 |
+
const query = req.query.q;
|
| 85 |
+
|
| 86 |
+
// Vulnerable: Directly inserting user input into HTML
|
| 87 |
+
const html = `
|
| 88 |
+
<html>
|
| 89 |
+
<head><title>Search Results</title></head>
|
| 90 |
+
<body>
|
| 91 |
+
<h1>Search Results for: ${query}</h1>
|
| 92 |
+
<div id="results"></div>
|
| 93 |
+
<script>
|
| 94 |
+
document.getElementById('results').innerHTML = 'You searched for: ${query}';
|
| 95 |
+
</script>
|
| 96 |
+
</body>
|
| 97 |
+
</html>
|
| 98 |
+
`;
|
| 99 |
+
|
| 100 |
+
res.send(html);
|
| 101 |
+
});
|
| 102 |
+
|
| 103 |
+
// Vulnerable endpoint - NoSQL Injection
|
| 104 |
+
app.get('/user', (req, res) => {
|
| 105 |
+
const username = req.query.username;
|
| 106 |
+
|
| 107 |
+
// This is just a simulation since we don't have a real DB
|
| 108 |
+
// But this pattern would be vulnerable to NoSQL injection
|
| 109 |
+
const query = { username: username };
|
| 110 |
+
|
| 111 |
+
// Simulating a database response
|
| 112 |
+
res.json({
|
| 113 |
+
message: `User query executed with: ${JSON.stringify(query)}`,
|
| 114 |
+
user: { username, email: `${username}@example.com` }
|
| 115 |
+
});
|
| 116 |
+
});
|
| 117 |
+
|
| 118 |
+
// Start the server
|
| 119 |
+
app.listen(port, () => {
|
| 120 |
+
console.log(`Server running on port ${port}`);
|
| 121 |
+
});
|
| 122 |
+
|
| 123 |
+
module.exports = app;
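The `/file` route above joins user input onto a base directory with no validation, which is what makes it a path traversal example. As an illustration of the containment check that is missing, here is a minimal sketch in Python (the language of the scanner, not of the example app). The function name `resolve_inside` is hypothetical.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting anything that escapes it."""
    base = Path(base_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check
    candidate = (base / user_path).resolve()

    if candidate != base and base not in candidate.parents:
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate
```

An absolute `user_path` replaces the base when joined, so it is rejected by the same check unless it already points inside `base_dir`.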
|
examples/js_vuln/package.json
ADDED
|
@@ -0,0 +1,32 @@
|
|
|
| 1 |
+
{
|
| 2 |
+
"name": "simple-web-app",
|
| 3 |
+
"version": "1.0.0",
|
| 4 |
+
"description": "A simple web application for demonstration purposes",
|
| 5 |
+
"main": "app.js",
|
| 6 |
+
"scripts": {
|
| 7 |
+
"start": "node app.js",
|
| 8 |
+
"test": "jest"
|
| 9 |
+
},
|
| 10 |
+
"keywords": [
|
| 11 |
+
"web",
|
| 12 |
+
"app",
|
| 13 |
+
"demo"
|
| 14 |
+
],
|
| 15 |
+
"author": "Demo User",
|
| 16 |
+
"license": "MIT",
|
| 17 |
+
"dependencies": {
|
| 18 |
+
"express": "4.16.0",
|
| 19 |
+
"lodash": "4.17.5",
|
| 20 |
+
"moment": "2.19.3",
|
| 21 |
+
"jquery": "3.3.1",
|
| 22 |
+
"axios": "0.18.0",
|
| 23 |
+
"minimist": "1.2.0",
|
| 24 |
+
"node-fetch": "2.3.0",
|
| 25 |
+
"handlebars": "4.0.11"
|
| 26 |
+
},
|
| 27 |
+
"devDependencies": {
|
| 28 |
+
"jest": "23.6.0",
|
| 29 |
+
"mocha": "5.2.0",
|
| 30 |
+
"eslint": "4.18.2"
|
| 31 |
+
}
|
| 32 |
+
}
|
examples/js_vuln/utils.js
ADDED
|
@@ -0,0 +1,89 @@
|
|
|
| 1 |
+
/**
|
| 2 |
+
* Utility functions for the application
|
| 3 |
+
*/
|
| 4 |
+
|
| 5 |
+
const crypto = require('crypto');
|
| 6 |
+
|
| 7 |
+
/**
|
| 8 |
+
* Generate a random token
|
| 9 |
+
* @param {number} length - Length of the token
|
| 10 |
+
* @returns {string} Random token
|
| 11 |
+
*/
|
| 12 |
+
function generateToken(length = 32) {
|
| 13 |
+
return crypto.randomBytes(length).toString('hex');
|
| 14 |
+
}
|
| 15 |
+
|
| 16 |
+
/**
|
| 17 |
+
* Validate user input
|
| 18 |
+
* @param {object} input - User input object
|
| 19 |
+
* @param {array} requiredFields - Required fields
|
| 20 |
+
* @returns {object} Validation result
|
| 21 |
+
*/
|
| 22 |
+
function validateInput(input, requiredFields) {
|
| 23 |
+
const errors = [];
|
| 24 |
+
|
| 25 |
+
// Check required fields
|
| 26 |
+
for (const field of requiredFields) {
|
| 27 |
+
if (!input[field]) {
|
| 28 |
+
errors.push(`${field} is required`);
|
| 29 |
+
}
|
| 30 |
+
}
|
| 31 |
+
|
| 32 |
+
return {
|
| 33 |
+
isValid: errors.length === 0,
|
| 34 |
+
errors
|
| 35 |
+
};
|
| 36 |
+
}
|
| 37 |
+
|
| 38 |
+
/**
|
| 39 |
+
* Sanitize user input - VULNERABLE: Incomplete sanitization
|
| 40 |
+
* @param {string} input - User input
|
| 41 |
+
* @returns {string} Sanitized input
|
| 42 |
+
*/
|
| 43 |
+
function sanitizeInput(input) {
|
| 44 |
+
// This is an incomplete sanitization that doesn't properly handle all XSS vectors
|
| 45 |
+
return input
|
| 46 |
+
.replace(/</g, '&lt;')
|
| 47 |
+
.replace(/>/g, '&gt;');
|
| 48 |
+
}
|
| 49 |
+
|
| 50 |
+
/**
|
| 51 |
+
* Parse query parameters - VULNERABLE: Doesn't handle parameter pollution
|
| 52 |
+
* @param {string} queryString - Query string
|
| 53 |
+
* @returns {object} Parsed parameters
|
| 54 |
+
*/
|
| 55 |
+
function parseQueryParams(queryString) {
|
| 56 |
+
const params = {};
|
| 57 |
+
const pairs = queryString.split('&');
|
| 58 |
+
|
| 59 |
+
for (const pair of pairs) {
|
| 60 |
+
const [key, value] = pair.split('=');
|
| 61 |
+
params[key] = decodeURIComponent(value || '');
|
| 62 |
+
}
|
| 63 |
+
|
| 64 |
+
return params;
|
| 65 |
+
}
|
| 66 |
+
|
| 67 |
+
/**
|
| 68 |
+
* Log user activity
|
| 69 |
+
* @param {string} userId - User ID
|
| 70 |
+
* @param {string} action - Action performed
|
| 71 |
+
* @param {object} data - Additional data
|
| 72 |
+
*/
|
| 73 |
+
function logActivity(userId, action, data = {}) {
|
| 74 |
+
const timestamp = new Date().toISOString();
|
| 75 |
+
console.log(JSON.stringify({
|
| 76 |
+
timestamp,
|
| 77 |
+
userId,
|
| 78 |
+
action,
|
| 79 |
+
data
|
| 80 |
+
}));
|
| 81 |
+
}
|
| 82 |
+
|
| 83 |
+
module.exports = {
|
| 84 |
+
generateToken,
|
| 85 |
+
validateInput,
|
| 86 |
+
sanitizeInput,
|
| 87 |
+
parseQueryParams,
|
| 88 |
+
logActivity
|
| 89 |
+
};
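The `sanitizeInput` helper above is marked as incomplete because it handles only `<` and `>`. A short Python sketch shows why that matters: a payload that uses quotes instead of angle brackets passes through unchanged. Both function names are hypothetical; `html.escape` is the standard-library call.

```python
import html


def sanitize_incomplete(text: str) -> str:
    """Mirror of the flawed helper: escapes angle brackets only."""
    return text.replace("<", "&lt;").replace(">", "&gt;")


def sanitize_complete(text: str) -> str:
    """Escape &, <, >, and both quote characters for HTML contexts."""
    return html.escape(text, quote=True)


# A payload that breaks out of an HTML attribute without any angle brackets
payload = '" onmouseover="alert(1)'
print(sanitize_incomplete(payload))  # unchanged, still dangerous inside an attribute
print(sanitize_complete(payload))    # quotes are replaced with &quot;
```

Even full HTML escaping is not sufficient when the value is placed inside a `<script>` block, as in the `/search` route of the example app; that context needs JavaScript string encoding instead.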
|
examples/js_vuln/views/index.handlebars
ADDED
|
@@ -0,0 +1,86 @@
|
|
|
| 1 |
+
<!DOCTYPE html>
|
| 2 |
+
<html lang="en">
|
| 3 |
+
<head>
|
| 4 |
+
<meta charset="UTF-8">
|
| 5 |
+
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
| 6 |
+
<title>{{title}}</title>
|
| 7 |
+
<link rel="stylesheet" href="/css/style.css">
|
| 8 |
+
<script src="https://code.jquery.com/jquery-3.3.1.min.js"></script>
|
| 9 |
+
</head>
|
| 10 |
+
<body>
|
| 11 |
+
<header>
|
| 12 |
+
<h1>Simple Web Application</h1>
|
| 13 |
+
<nav>
|
| 14 |
+
<ul>
|
| 15 |
+
<li><a href="/">Home</a></li>
|
| 16 |
+
<li><a href="/search?q=test">Search</a></li>
|
| 17 |
+
<li><a href="/user?username=admin">User</a></li>
|
| 18 |
+
</ul>
|
| 19 |
+
</nav>
|
| 20 |
+
</header>
|
| 21 |
+
|
| 22 |
+
<main>
|
| 23 |
+
<section class="welcome">
|
| 24 |
+
<h2>Welcome to our application!</h2>
|
| 25 |
+
<p>This is a simple web application for demonstration purposes.</p>
|
| 26 |
+
</section>
|
| 27 |
+
|
| 28 |
+
<section class="features">
|
| 29 |
+
<h2>Features</h2>
|
| 30 |
+
<ul>
|
| 31 |
+
<li>RESTful API endpoints</li>
|
| 32 |
+
<li>User authentication</li>
|
| 33 |
+
<li>File handling</li>
|
| 34 |
+
<li>Search functionality</li>
|
| 35 |
+
<li>Proxy capabilities</li>
|
| 36 |
+
</ul>
|
| 37 |
+
</section>
|
| 38 |
+
|
| 39 |
+
<section class="demo">
|
| 40 |
+
<h2>Try it out</h2>
|
| 41 |
+
|
| 42 |
+
<div class="demo-box">
|
| 43 |
+
<h3>Search</h3>
|
| 44 |
+
<form action="/search" method="GET">
|
| 45 |
+
<input type="text" name="q" placeholder="Enter search term">
|
| 46 |
+
<button type="submit">Search</button>
|
| 47 |
+
</form>
|
| 48 |
+
</div>
|
| 49 |
+
|
| 50 |
+
<div class="demo-box">
|
| 51 |
+
<h3>User Lookup</h3>
|
| 52 |
+
<form action="/user" method="GET">
|
| 53 |
+
<input type="text" name="username" placeholder="Enter username">
|
| 54 |
+
<button type="submit">Look up</button>
|
| 55 |
+
</form>
|
| 56 |
+
</div>
|
| 57 |
+
|
| 58 |
+
<div class="demo-box">
|
| 59 |
+
<h3>File Retrieval</h3>
|
| 60 |
+
<form action="/file" method="GET">
|
| 61 |
+
<input type="text" name="name" placeholder="Enter file name">
|
| 62 |
+
<button type="submit">Get File</button>
|
| 63 |
+
</form>
|
| 64 |
+
</div>
|
| 65 |
+
</section>
|
| 66 |
+
</main>
|
| 67 |
+
|
| 68 |
+
<footer>
|
| 69 |
+
<p>© 2023 Simple Web Application. All rights reserved.</p>
|
| 70 |
+
</footer>
|
| 71 |
+
|
| 72 |
+
<script>
|
| 73 |
+
// Vulnerable: jQuery usage with potential XSS
|
| 74 |
+
$(document).ready(function() {
|
| 75 |
+
// Get URL parameters
|
| 76 |
+
const urlParams = new URLSearchParams(window.location.search);
|
| 77 |
+
const message = urlParams.get('message');
|
| 78 |
+
|
| 79 |
+
// Vulnerable: Directly inserting URL parameter into DOM
|
| 80 |
+
if (message) {
|
| 81 |
+
$('#message-container').html('<div class="message">' + message + '</div>');
|
| 82 |
+
}
|
| 83 |
+
});
|
| 84 |
+
</script>
|
| 85 |
+
</body>
|
| 86 |
+
</html>
|
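Review note on the inline script above: the fixture is flagged by its own comments as deliberately vulnerable, since the `message` query parameter is concatenated into markup and passed to `.html()`. For contrast, a minimal sketch of the non-vulnerable variant could look like the following. The helper names `escapeHtml` and `renderMessage` are hypothetical and not part of this commit; the sketch only assumes the same `<div class="message">` wrapper used in the template.

```javascript
// Hypothetical helper (not part of the commit): escape the characters that
// are significant in HTML so untrusted text cannot introduce markup.
function escapeHtml(text) {
  const replacements = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;',
  };
  return String(text).replace(/[&<>"']/g, (ch) => replacements[ch]);
}

// Hypothetical helper: build the same wrapper markup as the template,
// but with the untrusted parameter escaped before concatenation.
function renderMessage(message) {
  return '<div class="message">' + escapeHtml(message) + '</div>';
}
```

In the page this would be used as `$('#message-container').html(renderMessage(message))`; setting the text with jQuery's `.text()` on a created element would avoid string concatenation altogether. Note also that the template as shown does not define an element with the id `message-container`, so the selector matches nothing unless that element is added elsewhere.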
setup.py
CHANGED
|
@@ -16,9 +16,12 @@ setup(
|
|
| 16 |
author="Brieuc Crosson",
|
| 17 |
author_email="briossant.com@gmail.com",
|
| 18 |
url="https://github.com/briossant/AgentPimentBleu",
|
| 19 |
-
packages=find_packages(),
|
| 20 |
py_modules=["app"],
|
| 21 |
include_package_data=True,
|
| 22 |
install_requires=requirements,
|
| 23 |
entry_points={
|
| 24 |
"console_scripts": [
|
| 16 |
author="Brieuc Crosson",
|
| 17 |
author_email="briossant.com@gmail.com",
|
| 18 |
url="https://github.com/briossant/AgentPimentBleu",
|
| 19 |
+
packages=find_packages() + ['examples', 'examples.js_vuln', 'examples.js_vuln.views'],
|
| 20 |
py_modules=["app"],
|
| 21 |
include_package_data=True,
|
| 22 |
+
package_data={
|
| 23 |
+
'examples': ['js_vuln/*', 'js_vuln/views/*'],
|
| 24 |
+
},
|
| 25 |
install_requires=requirements,
|
| 26 |
entry_points={
|
| 27 |
"console_scripts": [
|