metadata
title: Legacy Code Modernizer - Autonomous AI Agent
emoji: 🤖
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 6.0.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: Autonomous AI agent for code modernization with MCP tools
tags:
  - mcp-in-action-track-enterprise
  - mcp-in-action-track-consumer
  - code-modernization
  - autonomous-agent
  - mcp
  - gradio
  - gemini
  - modal
  - llama-index
  - nebius
  - chromadb

🤖 Legacy Code Modernizer - Autonomous AI Agent

Track 2: MCP in Action - Enterprise Applications

An autonomous AI agent that modernizes legacy codebases through intelligent planning, reasoning, and execution using Model Context Protocol (MCP) tools.

🎯 Project Overview

Legacy Code Modernizer is a complete autonomous agent system that transforms outdated code into modern, secure, and maintainable software. The agent autonomously:

  1. Plans - Analyzes codebases and creates modernization strategies
  2. Reasons - Makes intelligent decisions about transformation priorities
  3. Executes - Applies transformations, generates tests, and validates changes
  4. Integrates - Creates GitHub PRs with comprehensive documentation

πŸ† Why This Project Stands Out

Autonomous Agent Capabilities

Multi-Phase Planning & Reasoning:

  • Phase 1: Intelligent file discovery and classification using AI pattern detection
  • Phase 2: Semantic code analysis with vector-based similarity search (LlamaIndex + Chroma)
  • Phase 3: Deep pattern analysis using multiple AI models (Gemini, Nebius AI)
  • Phase 4: Autonomous code transformation with context-aware reasoning
  • Phase 5: Automated testing in isolated sandbox + GitHub PR creation
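
The five phases above run in a fixed order, each building on the previous one. A minimal sketch of that sequencing follows, assuming a shared state object and placeholder phases; `PipelineState`, `make_phase`, and `run_pipeline` are hypothetical names for illustration, not the project's actual code.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable


@dataclass
class PipelineState:
    """Shared state passed from one phase to the next (hypothetical shape)."""
    files: dict[str, str]                       # path -> source text
    log: list[str] = field(default_factory=list)


Phase = Callable[[PipelineState], Awaitable[None]]


def make_phase(name: str) -> Phase:
    """Build a placeholder phase that only records that it ran."""
    async def phase(state: PipelineState) -> None:
        state.log.append(f"{name}: {len(state.files)} file(s)")
    return phase


# Placeholder phases, in the order listed above.
PHASES: list[Phase] = [
    make_phase("discover"),
    make_phase("index"),
    make_phase("analyze"),
    make_phase("transform"),
    make_phase("test_and_pr"),
]


async def run_pipeline(state: PipelineState, phases: list[Phase]) -> PipelineState:
    """Run phases strictly in order; an exception in any phase stops the run."""
    for phase in phases:
        await phase(state)
    return state


state = asyncio.run(run_pipeline(PipelineState({"legacy.py": "print 'hi'"}), PHASES))
```

Running the phases sequentially keeps later steps (transformation, testing) from starting before discovery and analysis have finished.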

Context Engineering & RAG:

  • Vector embeddings for semantic code search
  • Pattern grouping across similar files
  • Historical transformation caching via MCP Memory
  • Real-time migration guide retrieval via MCP Search
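
To illustrate the caching idea, here is a small sketch that keys an analysis result on a hash of the source and the target version. It uses an in-memory dictionary as a stand-in for Memory MCP; the class and method names are assumptions made for this example.

```python
import hashlib
import json
from typing import Optional


class AnalysisCache:
    """In-memory stand-in for a persistent store such as Memory MCP."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    @staticmethod
    def key_for(source: str, target_version: str) -> str:
        # Identical code and target version always produce the same key.
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        return f"{target_version}:{digest}"

    def get(self, source: str, target_version: str) -> Optional[dict]:
        raw = self._store.get(self.key_for(source, target_version))
        return json.loads(raw) if raw is not None else None

    def put(self, source: str, target_version: str, analysis: dict) -> None:
        # Store JSON text so the same payload could be sent to an external store.
        self._store[self.key_for(source, target_version)] = json.dumps(analysis)
```

Hashing the content rather than the file path means a renamed but unchanged file still hits the cache.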

MCP Tools Integration

The agent integrates four MCP servers as autonomous tools (three in use, one planned):

  1. GitHub MCP - Autonomous PR creation with comprehensive documentation
  2. Tavily Search MCP - Real-time migration guide discovery
  3. Memory MCP - Pattern analysis caching and learning
  4. Filesystem MCP - Safe file operations (planned)

Real-World Enterprise Value

  • Multi-language support: Python, Java, JavaScript, TypeScript
  • Secure execution: Modal sandbox with isolated test environments
  • Production-ready: Comprehensive test generation with coverage reporting

🚀 Demo

Video Demo

Demo video

Social Media Post

Post on X

🎬 Quick Start

Try It Live on Hugging Face Spaces

  1. Upload a code file (Python, Java, JavaScript, TypeScript)
  2. Select target version (auto-detected from your code)
  3. Click "Start Modernization"
  4. Watch the autonomous agent work through all 5 phases
  5. Download modernized code, tests, and reports

Local Installation

# Clone repository
git clone https://huggingface.co/spaces/MCP-1st-Birthday/legacy_code_modernizer
cd legacy_code_modernizer

# Set up environment variables
cp .env.example .env
# Edit .env with your API keys:
# - GEMINI_API_KEY (required)
# - GITHUB_TOKEN (for PR creation)
# - TAVILY_API_KEY (for search)
# - MODAL_TOKEN_ID & MODAL_TOKEN_SECRET (for sandbox)

# Create a Python virtual environment
python -m venv venv
# Activate it (run only the line that matches your shell)
#   On macOS / Linux:
source venv/bin/activate
#   On Windows PowerShell:
.\venv\Scripts\Activate.ps1
#   On Windows CMD:
venv\Scripts\activate.bat

# Install dependencies
pip install -r requirements.txt

# Run the Gradio app
python app.py

🧠 Autonomous Agent Architecture

Planning Phase

Input: Legacy codebase
↓
Agent analyzes file structure and content
↓
Classifies files by modernization priority
↓
Creates transformation roadmap

Reasoning Phase

Agent groups similar patterns using vector search
↓
Retrieves migration guides via Tavily MCP
↓
Checks cached analyses via Memory MCP
↓
Prioritizes transformations by risk/impact
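
The last step, prioritizing by risk and impact, could be sketched as a simple weighted score. The 0 to 1 scales, the `risk_weight` default, and the names below are assumptions for illustration; the project may score candidates differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    path: str
    impact: float   # expected benefit, 0.0 to 1.0 (hypothetical scale)
    risk: float     # chance of breaking behaviour, 0.0 to 1.0 (hypothetical scale)


def priority(candidate: Candidate, risk_weight: float = 0.5) -> float:
    """Higher is better: reward impact, penalise risk."""
    return candidate.impact - risk_weight * candidate.risk


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Order candidates from highest to lowest priority."""
    return sorted(candidates, key=priority, reverse=True)
```

With this scoring, a moderately valuable but safe change can outrank a high-impact change that is likely to break behaviour.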

Execution Phase

Agent transforms code with AI models
↓
Generates comprehensive test suites
↓
Validates in isolated Modal sandbox
↓
Auto-fixes export/import issues

Integration Phase

Agent creates GitHub branch via GitHub MCP
↓
Commits transformed files
↓
Generates PR with deployment checklist
↓
Adds rollback plan and test results

🛠️ Technical Stack

AI & LLM

  • Google Gemini - Primary reasoning engine with large context window
  • Nebius AI - Alternative model for diverse perspectives
  • LlamaIndex - RAG framework for semantic code search
  • Chroma - Vector database for embeddings
  • bge-large-en - Embedding model deployed on Modal for inference

MCP Integration

  • mcp (v1.22.0) - Model Context Protocol SDK
  • @modelcontextprotocol/server-github - GitHub operations
  • @modelcontextprotocol/server-tavily - Web search
  • @modelcontextprotocol/server-memory - Persistent storage

Execution & Testing

  • Modal - Serverless sandbox for secure test execution
  • pytest/Jest/JUnit - Language-specific test frameworks
  • Coverage.py/JaCoCo - Code coverage analysis

UI & Orchestration

  • Gradio 6.0 - Interactive web interface
  • LangGraph - Agent workflow orchestration
  • asyncio - Asynchronous execution

📊 Features Showcase

1. Intelligent Pattern Detection

# Agent automatically detects legacy patterns:
- Deprecated libraries (MySQLdb → PyMySQL)
- Security vulnerabilities (SQL injection)
- Python 2 syntax → Python 3
- Missing type hints
- Old-style string formatting
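
As a rough illustration of what detecting these patterns can look like, the sketch below uses a few regular expressions. This is a simplified stand-in: the project describes AI-based detection, and the pattern names and expressions here are assumptions that will miss many real cases.

```python
import re

# Hypothetical rule set: label -> regular expression.
LEGACY_PATTERNS: dict[str, re.Pattern] = {
    "deprecated MySQLdb import": re.compile(r"^\s*(?:import|from)\s+MySQLdb\b", re.M),
    "Python 2 print statement": re.compile(r"^\s*print\s+[^(\s]", re.M),
    "old-style % string formatting": re.compile(r"""["']\s*%\s*[\w(]"""),
    "SQL built with string formatting": re.compile(
        r"""execute\(\s*["'].*%s.*["']\s*%""", re.I
    ),
}


def detect_patterns(source: str) -> list[str]:
    """Return the labels of all rules that match somewhere in the source."""
    return [label for label, rx in LEGACY_PATTERNS.items() if rx.search(source)]
```

Regular expressions are cheap enough to run on every file first, leaving the more expensive model-based analysis for files that trigger at least one rule.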

2. Semantic Code Search

# Vector-based similarity search finds:
- Files with similar legacy patterns
- Related security vulnerabilities
- Common refactoring opportunities
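
The idea behind similarity search can be shown with a toy example. The sketch below replaces the real embedding model with simple token counts and ranks files by cosine similarity; `embed`, `cosine`, and `most_similar` are hypothetical helpers, not the LlamaIndex or Chroma APIs.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: identifier counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[A-Za-z_]\w*", text))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def most_similar(query: str, corpus: dict[str, str], top_k: int = 3) -> list[tuple[str, float]]:
    """Return the top_k (path, score) pairs most similar to the query snippet."""
    q = embed(query)
    scored = [(path, cosine(q, embed(source))) for path, source in corpus.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]
```

Files that score close to each other can then be grouped and transformed with the same strategy.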

3. Autonomous Test Generation

# Agent generates:
- Unit tests with pytest/Jest/JUnit
- Integration tests
- Edge case coverage
- Performance benchmarks

4. GitHub Integration via MCP

# Automated PR includes:
- Comprehensive change summary
- Test results with coverage
- Risk assessment
- Deployment checklist
- Rollback plan
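
A PR description with these sections could be assembled along the following lines. This is a sketch only: the function name, parameters, and checklist items are placeholders chosen for the example, and the actual PR is created through GitHub MCP rather than by this code.

```python
def build_pr_body(
    summary: str,
    changed_files: list[str],
    tests_passed: int,
    tests_total: int,
    coverage_pct: float,
    risk_level: str,
    rollback_steps: list[str],
) -> str:
    """Assemble a Markdown PR description from already-computed results."""
    sections = [
        "## Summary",
        summary,
        "",
        "## Changed files",
        *[f"- `{path}`" for path in changed_files],
        "",
        "## Test results",
        f"{tests_passed}/{tests_total} tests passed, {coverage_pct:.1f}% coverage",
        "",
        "## Risk assessment",
        f"Risk level: {risk_level}",
        "",
        "## Deployment checklist",
        "- [ ] Review the diff",                  # placeholder item
        "- [ ] Run the test suite in staging",    # placeholder item
        "",
        "## Rollback plan",
        *[f"{i}. {step}" for i, step in enumerate(rollback_steps, start=1)],
    ]
    return "\n".join(sections)
```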

🎯 Supported Languages & Versions

Python

  • Versions: 3.10, 3.11, 3.12, 3.13, 3.14
  • Frameworks: Django 5.2 LTS, Flask 3.1, FastAPI 0.122
  • Testing: pytest with coverage

Java

  • Versions: Java 17 LTS, 21 LTS, 23, 25 LTS
  • Frameworks: Spring Boot 3.4, 4.0
  • Testing: Maven + JUnit 5 + JaCoCo

JavaScript

  • Standards: ES2024, ES2025
  • Runtimes: Node.js 22 LTS, 24 LTS, 25
  • Frameworks: React 19, Angular 21, Vue 3.5, Express 5.1, Next.js 16
  • Testing: Jest with coverage

TypeScript

  • Versions: 5.6, 5.7, 5.8, 5.9
  • Frameworks: React 19, Angular 21, Next.js 16
  • Testing: Jest with ts-jest

🔒 Security & Isolation

Modal Sandbox Execution

  • Network isolation: No external network access during tests
  • Filesystem isolation: Temporary containers per execution
  • Resource limits: CPU and memory constraints
  • Automatic cleanup: Containers destroyed after execution

Code Validation

  • Syntax checking: Pre-execution validation
  • Import/export fixing: Automatic resolution of module issues
  • Security scanning: Detection of vulnerabilities
  • Type checking: Language-specific validation
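
For Python, pre-execution syntax checking can be as small as parsing the generated code before it is sent to the sandbox. The sketch below uses the standard library `ast` module; the function name and message format are assumptions, and other languages would need their own parsers.

```python
import ast


def check_python_syntax(source: str, filename: str = "<generated>") -> list[str]:
    """Return a list of syntax error messages; an empty list means the code parses."""
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as exc:
        return [f"{filename}:{exc.lineno}: {exc.msg}"]
    return []
```

Catching a syntax error here avoids spending a sandbox run on code that could never execute.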

🎓 Advanced Features

Context Engineering

  • Sliding window context: Manages large files efficiently
  • Cross-file analysis: Understands dependencies
  • Pattern learning: Improves with usage via Memory MCP
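
A sliding window over a large file might look like the sketch below: fixed-size windows of lines with a small overlap so that code spanning a boundary appears in both neighbours. The default sizes and the function name are assumptions for illustration.

```python
def sliding_windows(
    lines: list[str], size: int = 200, overlap: int = 20
) -> list[tuple[int, list[str]]]:
    """Split lines into overlapping windows.

    Returns (first_line_number, window_lines) pairs, with 1-based line numbers.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must be in [0, size)")
    step = size - overlap
    windows = []
    for start in range(0, len(lines), step):
        windows.append((start + 1, lines[start:start + size]))
        if start + size >= len(lines):
            break  # the last window already reaches the end of the file
    return windows
```

Keeping the starting line number with each window lets findings be mapped back to positions in the original file.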

RAG Implementation

  • Semantic chunking: Intelligent code splitting
  • Vector similarity: Finds related patterns
  • Hybrid search: Combines keyword + semantic search
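
One way to approach semantic chunking for Python sources is to split on top-level definitions instead of fixed character counts. The sketch below does this with the standard library `ast` module; it is an illustrative assumption, not the project's chunker, and it ignores module-level statements and decorators.

```python
import ast


def chunk_by_definition(source: str) -> list[tuple[str, str]]:
    """Split Python source into (name, code) chunks, one per top-level def or class."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # get_source_segment returns the exact text of the node (Python 3.8+).
            segment = ast.get_source_segment(source, node)
            if segment is not None:
                chunks.append((node.name, segment))
    return chunks
```

Chunks that follow definition boundaries keep each function or class intact, which tends to give more meaningful embeddings than arbitrary slices.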

Agent Reasoning

  • Priority scoring: Risk vs. impact analysis
  • Dependency tracking: Understands file relationships

📝 License

Apache 2.0 - See LICENSE file for details

🙏 Acknowledgments

Built for MCP's 1st Birthday Hackathon hosted by Anthropic and Gradio.

Powered by:

  • Google Gemini & Nebius AI
  • Model Context Protocol (MCP)
  • LlamaIndex & Chroma
  • Modal
  • Gradio

Autonomous agents + MCP tools = The future of software development