title: CodeSentry
emoji: 🛡️
colorFrom: indigo
colorTo: purple
sdk: docker
pinned: false
license: mit
app_port: 7860

🛡️ CodeSentry

CodeSentry is an enterprise-grade, agentic AI security and performance copilot that analyzes codebases, identifies critical vulnerabilities, and generates ready-to-merge patches. It also provides built-in CUDA → ROCm migration guidance for AMD hardware.

Built with a strict Zero Data Retention (ZDR) architecture, CodeSentry ensures that your proprietary code never leaves your secure environment or gets used for model training, making it perfect for highly sensitive, enterprise-scale environments.


✨ Key Features

  • 🧠 Agentic Pipeline: CodeSentry uses a multi-agent orchestration architecture:
    • Security Agent: Combines lightning-fast static analysis with deep semantic LLM reasoning to catch complex vulnerabilities (e.g., prompt injections, hardcoded secrets, unsafe deserialization).
    • Performance Agent: Specifically tailored to analyze ML/AI logic. It detects GPU memory bottlenecks, inefficient loop structures, and suggests hardware-native optimizations (like bfloat16 for AMD MI300X).
    • Fix Agent: Automatically generates unified Git-style diffs and line-by-line patch recommendations for every finding.
    • AMD Migration Advisor: Scans for 10 categories of CUDA-specific patterns (nvidia-smi, CUDA_VISIBLE_DEVICES, BitsAndBytes, cuDNN, FP16 usage, etc.) and provides actionable ROCm/HIP migration guidance with a 0–100 AMD Compatibility Score.
  • ⚡ AMD MI300X Live Metrics: Real-time GPU performance monitoring (utilization, VRAM, temperature, power draw, inference speed) streamed to the dashboard during every scan via SSE. Uses rocm-smi on AMD hardware, with a simulated fallback for development environments.
  • 🔒 Zero Data Retention (ZDR): Every analysis session generates a unique cryptographic Privacy Certificate. The backend blocks outgoing network calls during the scan and wipes all session data from memory as soon as the scan completes.
  • ⚡ Real-Time Streaming: The analysis engine uses Server-Sent Events (SSE) to stream findings to the frontend as they are produced, giving the dashboard a responsive "live" feel.
  • 📋 One-Click Reporting: Export full SECURITY_REPORT.md documents, structured JSON audit logs, copy-paste ready GitHub Pull Request descriptions, and AMD_MIGRATION_GUIDE.md reports.
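
To make the Privacy Certificate idea concrete, here is a minimal sketch of signing a per-session record with HMAC-SHA256. It assumes the certificate is signed with the ZDR_SIGNING_KEY described under Environment Variables. The field names (session_id, issued_at, code_sha256) and both function names are hypothetical; the actual certificate format is not documented in this README.

```python
import hashlib
import hmac
import json
import secrets
import time


def issue_certificate(code: bytes, signing_key: bytes) -> dict:
    """Build a signed record for one scan (field names are hypothetical)."""
    payload = {
        "session_id": secrets.token_hex(16),  # unique per scan
        "issued_at": int(time.time()),
        # Only a digest of the code is recorded, never the code itself.
        "code_sha256": hashlib.sha256(code).hexdigest(),
    }
    message = json.dumps(payload, sort_keys=True).encode("utf-8")
    signature = hmac.new(signing_key, message, hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def verify_certificate(cert: dict, signing_key: bytes) -> bool:
    """Recompute the HMAC over the payload and compare in constant time."""
    message = json.dumps(cert["payload"], sort_keys=True).encode("utf-8")
    expected = hmac.new(signing_key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, cert["signature"])
```

Serializing the payload with sort_keys=True keeps the signed bytes stable, so a verifier that rebuilds the JSON gets the same signature regardless of key order.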

🏗️ System Architecture

┌─────────────────────────────────────────────────────────────────┐
│                       CODESENTRY FRONTEND                       │
│           React + Vite | Cyberpunk Terminal Aesthetic           │
│  LandingPage → AnalysisView (SSE Live Feed) → ReportView        │
│         ┌───────────────────┐  ┌────────────────────────┐       │
│         │ AMD MI300X Live   │  │ AMD Migration Advisor  │       │
│         │ Metrics Card      │  │ Panel + Score Circle   │       │
│         └───────────────────┘  └────────────────────────┘       │
└─────────────────────────────┬───────────────────────────────────┘
                              │  SSE (Server-Sent Events) + REST
┌─────────────────────────────▼───────────────────────────────────┐
│                       CODESENTRY BACKEND                        │
│                        FastAPI / Python                         │
│                                                                 │
│  ┌─────────────┐  ┌──────────────────┐  ┌────────────────────┐  │
│  │  Security   │  │  Performance     │  │    Fix Agent       │  │
│  │  Agent      │  │  Agent           │  │ (patches + diffs)  │  │
│  └──────┬──────┘  └────────┬─────────┘  └────────┬───────────┘  │
│         │          ┌───────▼────────┐            │              │
│         │          │ AMD Migration  │            │              │
│         │          │ Advisor (10    │            │              │
│         │          │ CUDA patterns) │            │              │
│         │          └───────┬────────┘            │              │
│         └─────────────────►│◄────────────────────┘              │
│                     ┌──────▼──────┐                             │
│                     │ Orchestrator│                             │
│                     └──────┬──────┘                             │
│                            │                                    │
│  ┌─────────────────────────▼─────────────────────────────────┐  │
│  │ Privacy Guard │ Session Store │ AMD Metrics │ Code Parser │  │
│  └───────────────────────────────────────────────────────────┘  │
│                            │                                    │
│                     ┌──────▼──────┐                             │
│                     │  vLLM Server│ (Qwen2.5-Coder-32B)         │
│                     └─────────────┘                             │
└─────────────────────────────────────────────────────────────────┘

The project is divided into two main components:

1. The Backend (/codesentry-backend)

A high-performance FastAPI server that acts as the orchestrator.

  • Ingests code via GitHub URLs, Hugging Face Spaces URLs, Zip files, or raw code snippets.
  • Manages the stateful analysis session and memory lifecycle.
  • Runs AMD MI300X live metrics polling via rocm-smi (with simulated fallback for dev environments).
  • Runs the AMD Migration Advisor to detect CUDA-specific patterns and calculate an AMD Compatibility Score.
  • Connects to an LLM endpoint (optimized for local deployment via vLLM on AMD hardware, using Qwen2.5-Coder-32B) to power the intelligent agents.

2. The Frontend (/codesentry-frontend)

A modern React + Vite dashboard built with a premium, cyberpunk-inspired terminal aesthetic.

  • Connects to the backend via SSE for live streaming.
  • Features the AMD MI300X Live Performance Card in the Analysis View: 6 GPU metrics updated every 2 seconds.
  • Features the AMD ROCm Migration Advisor Panel in the Report View: animated score circle, collapsible findings, and one-click AMD_MIGRATION_GUIDE.md export.
  • Dynamic data visualization, animated severity charts, and side-by-side Before/After code diffing for AI-generated fixes.

🔴 AMD-Specific Features

Live Hardware Metrics (Analysis View)

During every scan, CodeSentry polls the AMD MI300X GPU via rocm-smi and streams live metrics to the dashboard:

| Metric | Description |
| --- | --- |
| GPU Utilization | Current compute load (%) |
| VRAM Used | GB used / 192 GB total, with visual bar |
| Memory Bandwidth | Data throughput (TB/s) |
| Temperature | GPU edge temperature (°C) |
| Power Draw | Current power consumption (W) |
| Inference Speed | LLM tokens per second |

On development machines without AMD hardware, the card displays realistic simulated values.
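
The polling-with-fallback behavior can be sketched roughly as follows. This is an illustration under stated assumptions, not CodeSentry's actual collector: the function names, the returned field names, and the simulated value ranges are all made up, and the rocm-smi flags shown (--showuse, --showtemp, --showpower, --json) should be checked against `rocm-smi --help` for your ROCm release.

```python
import json
import random
import shutil
import subprocess
from typing import Optional


def simulated_metrics(rng: random.Random) -> dict:
    """Placeholder values for machines without an AMD GPU (ranges are arbitrary)."""
    return {
        "source": "simulated",
        "gpu_util_pct": round(rng.uniform(40, 95), 1),
        "vram_used_gb": round(rng.uniform(20, 120), 1),
        "vram_total_gb": 192,
        "temperature_c": round(rng.uniform(45, 75), 1),
        "power_w": round(rng.uniform(300, 700), 1),
    }


def read_metrics(smi_path: Optional[str], rng: random.Random) -> dict:
    """Query rocm-smi when a path is given, otherwise return simulated data."""
    if smi_path is None:
        return simulated_metrics(rng)
    try:
        result = subprocess.run(
            [smi_path, "--showuse", "--showtemp", "--showpower", "--json"],
            capture_output=True, text=True, timeout=5, check=True,
        )
        # JSON key names differ between ROCm releases, so pass them through raw.
        return {"source": "rocm-smi", "raw": json.loads(result.stdout)}
    except (OSError, subprocess.SubprocessError, json.JSONDecodeError):
        # Missing binary, non-zero exit, timeout, or unparsable output.
        return simulated_metrics(rng)


# shutil.which returns None when rocm-smi is not on PATH.
snapshot = read_metrics(shutil.which("rocm-smi"), random.Random(0))
```

Tagging each snapshot with a source field lets the dashboard show whether the numbers are real or simulated.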

CUDA → ROCm Migration Advisor (Report View)

The Migration Advisor scans code for 10 categories of CUDA-specific patterns:

| ID | Severity | Pattern | What It Detects |
| --- | --- | --- | --- |
| AMD_M01 | Low | torch.cuda.is_available() | CUDA device check |
| AMD_M02 | Critical | nvidia-smi | NVIDIA-only CLI tool |
| AMD_M03 | High | CUDA_VISIBLE_DEVICES | CUDA env variable |
| AMD_M04 | High | torch.cuda.amp.autocast/GradScaler | Legacy CUDA AMP |
| AMD_M05 | Medium | .half() / torch.float16 | FP16 suboptimal on MI300X |
| AMD_M06 | Medium | torch.backends.cudnn.* | cuDNN configuration |
| AMD_M07 | High | import flash_attn | CUDA-only Flash Attention |
| AMD_M08 | Low | torch.cuda.memory_allocated() | CUDA memory profiling |
| AMD_M09 | Low | device = 'cuda' | Hardcoded device string |
| AMD_M10 | Critical | BitsAndBytesConfig | CUDA-only quantization |
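
A pattern scan of this kind can be approximated with a few regular expressions. The sketch below covers only three of the ten IDs; the regexes, the Rule type, and the scan function are illustrative guesses rather than the advisor's real rules. The two replacement hints (rocm-smi, HIP_VISIBLE_DEVICES) are the ones named in the example workflow.

```python
import re
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    rule_id: str
    severity: str
    pattern: re.Pattern
    hint: Optional[str]  # suggested replacement, when this README names one


# Subset of the table; the regexes themselves are assumptions.
RULES = [
    Rule("AMD_M02", "Critical", re.compile(r"\bnvidia-smi\b"), "rocm-smi"),
    Rule("AMD_M03", "High", re.compile(r"\bCUDA_VISIBLE_DEVICES\b"), "HIP_VISIBLE_DEVICES"),
    Rule("AMD_M10", "Critical", re.compile(r"\bBitsAndBytesConfig\b"), None),
]


def scan(source: str) -> list[dict]:
    """Return one finding per (line, rule) match, in source order."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for rule in RULES:
            if rule.pattern.search(line):
                findings.append({
                    "id": rule.rule_id,
                    "severity": rule.severity,
                    "line": lineno,
                    "hint": rule.hint,
                })
    return findings


sample = 'import os\nos.environ["CUDA_VISIBLE_DEVICES"] = "0"\nos.system("nvidia-smi")\n'
for finding in scan(sample):
    print(finding)
```

A line-based regex scan is cheap but will also match occurrences inside comments and strings; a real implementation might parse the code instead.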

Compatibility Scoring:

≥ 90% → "Fully ROCm Ready" (green)
≥ 70% → "Mostly Compatible" (yellow)
≥ 50% → "Needs Migration Work" (orange)
< 50% → "CUDA-Specific Codebase" (red)
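
The threshold bands translate directly into a small lookup. This sketch only maps a score to its label and color; how the 0–100 score itself is computed is not described here, so that part is left out. The function name is hypothetical.

```python
def compatibility_label(score: float) -> tuple[str, str]:
    """Map a 0-100 AMD Compatibility Score to its (label, color) band."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return ("Fully ROCm Ready", "green")
    if score >= 70:
        return ("Mostly Compatible", "yellow")
    if score >= 50:
        return ("Needs Migration Work", "orange")
    return ("CUDA-Specific Codebase", "red")
```

Checking the bands from highest to lowest means each boundary value (90, 70, 50) falls into the higher band, matching the ≥ signs in the list.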

💡 How It Works (An Example Workflow)

To understand CodeSentry, imagine you have a Python scraping script that takes user input and feeds it into an LLM.

  1. Initiate Scan: You paste the GitHub or Hugging Face Space URL of the script into the CodeSentry dashboard.
  2. Live GPU Monitoring: The AMD MI300X Live Performance card immediately starts showing real-time GPU utilization, VRAM usage, temperature, and inference speed.
  3. Security Sweep: The Security Agent immediately flags cli.py:61 for a Prompt Injection (CWE-74) vulnerability because it detects raw user input being passed to the model without sanitization.
  4. Performance Sweep: The Performance Agent notices the code is loading a large transformer model inside a loop. It flags this and estimates you are wasting significant inference time.
  5. AMD Migration Scan: The Migration Advisor detects nvidia-smi calls and CUDA_VISIBLE_DEVICES usage, calculating an AMD Compatibility Score and suggesting rocm-smi and HIP_VISIBLE_DEVICES replacements.
  6. Fix Generation: The Fix Agent takes these findings and writes a patch. It refactors the prompt injection to use a parameterized template and hoists the model initialization outside the loop.
  7. Review: You view the dashboard. The findings are categorized by severity. You click on the Prompt Injection finding, and an AI-Generated Fix panel opens showing exactly what lines to change. The AMD Migration Panel shows your compatibility score with collapsible fix guidance.
  8. Export: You click "Copy PR Description" and paste a perfectly formatted summary of the fixes directly into your GitHub Pull Request. You also export the AMD_MIGRATION_GUIDE.md for your DevOps team.
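
As an illustration of the Fix Generation step, the sketch below builds a unified, Git-style diff for the "model loaded inside a loop" finding using Python's standard difflib module. The before/after snippets and the load_model helper are invented for the example; the Fix Agent's real output and the way it produces diffs may differ.

```python
import difflib

# Hypothetical "before": the model is reloaded on every iteration.
BEFORE = '''\
for prompt in prompts:
    model = load_model("my-model")
    results.append(model.generate(prompt))
'''

# Hypothetical "after": model initialization hoisted out of the loop.
AFTER = '''\
model = load_model("my-model")
for prompt in prompts:
    results.append(model.generate(prompt))
'''

diff = "".join(
    difflib.unified_diff(
        BEFORE.splitlines(keepends=True),
        AFTER.splitlines(keepends=True),
        fromfile="a/cli.py",
        tofile="b/cli.py",
    )
)
print(diff)
```

The printed diff marks the indented load_model line as removed and the hoisted one as added, which is the kind of change a reviewer would see in the Before/After panel.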

🚀 Installation & Setup

Prerequisites

  • Node.js (v20.19+ or v22.12+)
  • Python (v3.10+)
  • An API Key for your LLM provider (e.g., Groq) if not running a completely local vLLM instance.

1. Backend Setup

Open a terminal and navigate to the backend directory:

cd codesentry-backend

# Create and activate a virtual environment
python -m venv venv
# On Windows:
venv\Scripts\activate
# On Mac/Linux:
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Configure Environment Variables
# Create a .env file based on the example and add your LLM_API_KEY
cp .env.example .env

# Run the backend server
uvicorn main:app --reload --port 8000

The backend will now be running on http://127.0.0.1:8000.

2. Frontend Setup

Open a second terminal and navigate to the frontend directory:

cd codesentry-frontend

# Install dependencies
npm install

# Ensure VITE_MOCK_MODE is set to false to connect to the live backend
echo "VITE_MOCK_MODE=false" > .env

# Run the development server
npm run dev

The dashboard will be available at http://127.0.0.1:5173.


⚙️ Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| VLLM_BASE_URL | http://localhost:8080/v1 | vLLM OpenAI-compatible endpoint |
| MODEL_NAME | Qwen/Qwen2.5-Coder-32B-Instruct | Model served by vLLM |
| USE_LLM | true | Set false for static-only mode (CI) |
| PORT | 8000 | CodeSentry API port |
| CORS_ORIGINS | * | Allowed frontend origins |
| ZDR_SIGNING_KEY | (dev default) | HMAC key for certificates; change in production |
| GROQ_API_KEY | (none) | Groq cloud API key (alternative to local vLLM) |
| VITE_MOCK_MODE | false | Frontend: use mock data instead of live backend |
| VITE_API_URL | http://localhost:8000 | Frontend: backend base URL |
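
For reference, a backend could read these variables with their documented defaults along the following lines. The Settings class and load_settings function are hypothetical, and treating CORS_ORIGINS as a comma-separated list is an assumption; only the variable names and default values come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    vllm_base_url: str
    model_name: str
    use_llm: bool
    port: int
    cors_origins: list[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read backend settings, falling back to the documented defaults."""
    return Settings(
        vllm_base_url=env.get("VLLM_BASE_URL", "http://localhost:8080/v1"),
        model_name=env.get("MODEL_NAME", "Qwen/Qwen2.5-Coder-32B-Instruct"),
        # Anything other than "true" (case-insensitive) disables the LLM.
        use_llm=env.get("USE_LLM", "true").strip().lower() == "true",
        port=int(env.get("PORT", "8000")),
        # Assumption: multiple origins are given as a comma-separated list.
        cors_origins=[o.strip() for o in env.get("CORS_ORIGINS", "*").split(",")],
    )


print(load_settings({"USE_LLM": "false", "PORT": "9000"}))
```

Passing the environment in as a mapping keeps the function easy to test without touching the real process environment.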

📊 SSE Event Types

| Event | Description |
| --- | --- |
| scan_started | Scan session created, ID returned |
| agent_start | An agent begins (security / performance / fix) |
| finding | A security or performance vulnerability found |
| fix_ready | A fix patch generated for a specific finding |
| amd_metrics | Live AMD MI300X GPU metrics snapshot (every 2s) |
| amd_migration_finding | A CUDA → ROCm migration issue detected |
| amd_migration_summary | Compatibility score and summary |
| complete | Full analysis finished with summary + certificates |
| error | An error occurred during analysis |
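
On the wire, each of these events is a block of text in the standard Server-Sent Events format: an event: line, one or more data: lines, and a blank line. The sketch below shows one way to serialize and parse such a block. The payload fields (severity, file, line) are hypothetical, since the actual event schemas are not listed here.

```python
import json


def format_sse(event: str, data: dict) -> str:
    """Serialize one Server-Sent Event with a JSON payload."""
    # json.dumps emits a single line by default, so one data: line suffices.
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def parse_sse(chunk: str) -> tuple[str, dict]:
    """Parse one event block back into (event name, payload)."""
    event, data_lines = "message", []  # "message" is the SSE default event type
    for line in chunk.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
    return event, json.loads("\n".join(data_lines))


wire = format_sse("finding", {"severity": "High", "file": "cli.py", "line": 61})
print(wire, end="")
```

In the browser, the frontend would typically consume the same stream with EventSource and register one listener per event name.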

📦 Export Formats

| Format | Description |
| --- | --- |
| 📄 JSON Report | Machine-readable full report with all findings and fixes |
| 📝 SECURITY_REPORT.md | Human-readable markdown security report |
| 📋 Copy PR Description | GitHub Pull Request description copied to clipboard |
| 🔴 AMD_MIGRATION_GUIDE.md | AMD ROCm migration guide with score, findings, and fixes |

Built for the AMD Hackathon

CodeSentry was specifically designed to showcase the power of Agentic AI running on high-performance AMD MI300X compute hardware. By combining a suite of specialized agents with real-time GPU monitoring and CUDA → ROCm migration guidance, we shift the paradigm of static code analysis from "reporting problems" to "actively writing solutions."

Zero Data Retention. 100% Agentic. AMD-Optimized. Enterprise Ready.