title: StepWise Math AI
emoji: 🎓
colorFrom: gray
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: mit
tags:
  - building-mcp-track-consumer
  - building-mcp-track-creative
  - mcp
  - gradio
  - gemini
  - education
  - mathematics
  - ai
  - visualization
  - interactive-learning

StepWise Math

Transform Static Math Problems into Living, Interactive Step-by-Step Visual Proofs

This is a Gradio MCP Framework implementation of the StepWise Math React app, providing the same powerful features in a Python-based web interface.

MCP's 1st Birthday Hackathon | Track 1: Building MCP | Powered by Google Gemini

Overview

What This Project Does

StepWise Math is an MCP-capable service that transforms static math problems into interactive, step-by-step visual proofs. Built as a Gradio-based MCP server, it provides both a user-friendly web interface and programmatic MCP endpoints for AI agents and developer tools.

The system operates through a two-stage AI pipeline:

  • Stage 1 - Concept Analysis (Gemini 2.5 Flash): Generates a pedagogical JSON specification from text, URL, or image input
  • Stage 2 - Code Generation (Gemini 3.0 Pro): Synthesizes a self-contained HTML/JS interactive proof application
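The two stages can be pictured as a simple function composition. The sketch below is illustrative only: `run_pipeline`, the stage callables, and the `concept`/`steps` field names are hypothetical, and the Gemini calls are replaced by stand-in functions so the example runs offline.

```python
import json
from typing import Callable

# Hypothetical stage signatures; the real app calls Gemini models here.
AnalyzeFn = Callable[[str], str]     # problem text -> specification JSON string
GenerateFn = Callable[[dict], str]   # parsed specification -> HTML document


def run_pipeline(problem: str, analyze: AnalyzeFn, generate: GenerateFn) -> str:
    """Run Stage 1, check its output, then run Stage 2 and return the HTML."""
    spec = json.loads(analyze(problem))  # Stage 1 output must be valid JSON
    if not spec.get("steps"):
        raise ValueError("specification contains no steps")
    return generate(spec)                # Stage 2 consumes the parsed spec


# Stand-ins for the two models (assumptions, for illustration only).
def fake_analyze(problem: str) -> str:
    return json.dumps({"concept": problem, "steps": [{"title": "Draw the triangle"}]})


def fake_generate(spec: dict) -> str:
    items = "".join(f"<li>{step['title']}</li>" for step in spec["steps"])
    return f"<html><body><h1>{spec['concept']}</h1><ol>{items}</ol></body></html>"


html = run_pipeline("Pythagorean Theorem", fake_analyze, fake_generate)
```

Passing the stages in as callables keeps the orchestration testable without an API key; swapping in real model calls would not change `run_pipeline`.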

Why This Project Matters

  • Educational Impact: Empowers teachers and students to visualize mathematical reasoning step-by-step, transforming abstract concepts into concrete, interactive experiences
  • MCP Showcase: Demonstrates best practices for building MCP servers that integrate multi-step LLM workflows, streaming thoughts, and developer-facing prompts/resources
  • Reference Implementation: Provides a complete example of combining AI-powered analysis with code generation in an MCP-compliant architecture

Documentation

For complete product specifications, feature requirements, and technical implementation details, see the Product Requirements Document (PRD.md).

The PRD covers:

  • Target audience and user personas (Grades 6-10 students, teachers, tutors)
  • Detailed functional requirements and data models
  • UI/UX design specifications
  • Example use cases (Pythagorean Theorem, Slope-Intercept Form)
  • System constraints and technical architecture

Quick Start

Using Gradio UI

  1. Enter your Gemini API Key in the Configuration section (get one free at ai.google.dev). This step is needed only when the embedded API key has run out of credits.
  2. Choose your input method (Text, Image, or URL)
  3. Describe a math problem or concept
  4. Click Generate Guided Proof
  5. Explore the interactive visualization!

Using MCP Clients

  1. Point your MCP client (e.g., Claude Desktop, VSCode) to the deployed MCP server URL: https://mcp-1st-birthday-stepwise-math-ai.hf.space/gradio_api/mcp/

  2. Configure the MCP server settings in your client as follows:

    Claude Desktop (claude_desktop_config.json)
    {
      "mcpServers": {
        "stepwise": {
          "command": "npx",
          "args": [
            "mcp-remote",
            "https://mcp-1st-birthday-stepwise-math-ai.hf.space/gradio_api/mcp/",
            "--transport",
            "streamable-http"
          ]
        }
      }
    }
    

    After updating, restart Claude Desktop.

    VSCode (settings.json)
    {
      "servers": {
        "stepwise": {
          "url": "https://mcp-1st-birthday-stepwise-math-ai.hf.space/gradio_api/mcp/",
          "type": "http"
        }
      }
    }
    
  3. Open your MCP client and discover the available prompts and tools.
  4. Invoke the create_visual_math_proof prompt or the underlying tools to generate interactive proofs programmatically.
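For readers curious about what the client sends when it invokes a tool: MCP tool calls are JSON-RPC 2.0 requests with the method `tools/call`. The sketch below builds such a message by hand. The helper `make_tool_call` and the argument key `text_input` are assumptions for illustration; the registered tool name and its parameters should be taken from the schema your client discovers.

```python
import json


def make_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# The argument key "text_input" is a guess; check the discovered schema.
request = make_tool_call(
    1,
    "create_math_specification_from_text",
    {"text_input": "Prove the Pythagorean Theorem visually"},
)
```

In practice Claude Desktop, VSCode, or an MCP client library builds and sends this message for you.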

Features

MCP Tools, Prompts & Resources

  • Prompts: High-level conversational wrappers
  • Tools: Programmatic calls
  • Resources: JSON templates and examples
  • Discovery: All prompts/tools are registered in the server schema for MCP client access
  • Authentication: Use API keys (GEMINI_API_KEY)

Multi-Modal Input

  • Text Input: Describe any math problem in natural language
  • Image Upload: Upload photos of textbook problems or handwritten equations
  • URL Import: Reference YouTube videos, Khan Academy lessons, or web resources

Dual-Stage AI Pipeline

  • Stage 1 - The Teacher (Gemini 2.5 Flash): Analyzes concepts and designs pedagogical step sequences
  • Stage 2 - The Engineer (Gemini 3.0 Pro + Extended Thinking): Generates production-ready interactive visualizations

Interactive Step Navigation

  • Progressive disclosure of mathematical concepts
  • Back/Forward buttons to review at your own pace
  • Visual state changes synchronized with each step
  • Real-time equation updates as you interact
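The generated proofs are HTML/JS, but the Back/Forward behavior is easy to reason about in any language. The following Python sketch models one plausible version of that logic, where the step index is clamped to the valid range. The class name `StepNavigator` and the sample step titles are hypothetical and not taken from the generated code.

```python
from dataclasses import dataclass


@dataclass
class StepNavigator:
    """Tracks the current step and clamps movement to the valid range."""
    steps: list[str]
    index: int = 0

    def forward(self) -> str:
        # Stay on the last step instead of running past the end.
        self.index = min(self.index + 1, len(self.steps) - 1)
        return self.steps[self.index]

    def back(self) -> str:
        # Stay on the first step instead of going negative.
        self.index = max(self.index - 1, 0)
        return self.steps[self.index]


nav = StepNavigator(["Draw the triangle", "Build the squares", "Compare areas"])
first = nav.back()       # already at the first step, so nothing changes
second = nav.forward()   # moves to the second step
```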

System Architecture

The system utilizes a Two-Stage AI Pipeline orchestrated by a Python/Gradio core to transform abstract math concepts into interactive HTML5 applications.

  1. Ingestion (MCP & Web): Users submit requests via MCP-enabled clients (Claude Desktop, VSCode) or the direct Web UI.
  2. Stage 1 - Analysis (Gemini 2.5 Flash): The "Architect" model decomposes the mathematical concept into a structured MathSpec JSON, defining learning steps and visual logic.
  3. Stage 2 - Implementation (Gemini 3.0 Pro): The "Builder" model consumes the spec to generate self-contained, interactive HTML5/Canvas code.
  4. Delivery: The final executable app is rendered in the UI or returned to the MCP client for immediate use.
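To make the hand-off between the two stages concrete, here is a sketch of what a MathSpec might look like once parsed. The actual schema is defined by the project (see PRD.md); every field name below (`concept`, `steps`, `title`, `explanation`, `visual`) is an assumption chosen for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class Step:
    title: str        # short heading shown to the learner
    explanation: str  # what this step teaches
    visual: str       # description of what the canvas should show


@dataclass
class MathSpec:
    concept: str
    steps: list[Step]

    @classmethod
    def from_json(cls, raw: str) -> "MathSpec":
        """Parse a JSON string and reject specifications without steps."""
        data = json.loads(raw)
        steps = [Step(**item) for item in data["steps"]]
        if not steps:
            raise ValueError("a MathSpec needs at least one step")
        return cls(concept=data["concept"], steps=steps)


raw = json.dumps({
    "concept": "Slope-Intercept Form",
    "steps": [{
        "title": "Plot the intercept",
        "explanation": "The line crosses the y-axis at b.",
        "visual": "Point at (0, b) on a coordinate grid",
    }],
})
spec = MathSpec.from_json(raw)
```

Validating the Stage 1 output before Stage 2 runs is one way to catch a malformed specification early, rather than discovering it in broken generated code.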

[Figure: system architecture diagram]

What's Included:

  • Functioning MCP server exposing tools for creating math specifications and building interactive proofs
  • Gradio UI (app.py) for submitting text, URL, or image inputs and viewing generated proofs
  • MCP prompts and resources accessible to MCP clients (Claude Desktop, VSCode, etc.)
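As a rough picture of how a Gradio app can double as an MCP server, the sketch below wraps one function in a `gr.Interface` and launches it with `mcp_server=True`, which is how Gradio exposes functions as MCP tools (the docstring becomes the tool description). This is not the project's `app.py`: the function body is a placeholder, and the parameter name `text_input` is assumed.

```python
import json


def create_math_specification_from_text(text_input: str) -> str:
    """Create a step-by-step math specification from a text description.

    Args:
        text_input: The math problem or concept, in plain language.

    Returns:
        A JSON string describing the concept and its learning steps.
    """
    # Placeholder body: the real tool would call the Stage 1 model here.
    return json.dumps({"concept": text_input, "steps": []})


def main() -> None:
    import gradio as gr  # requires: pip install "gradio[mcp]"

    demo = gr.Interface(
        fn=create_math_specification_from_text,
        inputs=gr.Textbox(label="Problem"),
        outputs=gr.Textbox(label="MathSpec JSON"),
    )
    # mcp_server=True serves the MCP endpoint alongside the web UI.
    demo.launch(server_port=7860, mcp_server=True)
```

Calling `main()` starts the server on port 7860, matching the `app_port` in the Space metadata.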

Usage Guide

Generating Your First Proof

  1. Select Input Method: Choose Text, Image, or URL
  2. Provide Your Problem: Type a description, upload an image, or paste a URL, matching the input method you selected
  3. Click Generate: The AI will analyze and create your interactive proof
  4. Explore: Navigate through the steps in the Guided Proof tab

Refining Your Proof

  1. View the Generated Proof: Check the interactive simulation
  2. Provide Feedback: Type suggestions like "Make the triangle larger" or "Add labels to the vertices"
  3. Apply Refinement: The AI will regenerate with your feedback

Technical Details

MCP Integration Architecture & Programmatic Flow

The diagram illustrates how the Gradio application exposes its functionality for programmatic access via the Model Context Protocol (MCP). The Gradio App Server acts as the central hub, exposing capabilities in two categories:

  • MCP Prompts: High-level, conversational wrappers (e.g., create_visual_math_proof) for guiding multi-step workflows.
  • MCP Resources / Tools: Direct, callable functions (e.g., create_math_specification_from_text, build_interactive_proof_from_specification) for specific tasks.

MCP-aware clients or agents can interact with these components by first discovering the available tools through the server schema and then authenticating with an API key.

The bottom section details a typical programmatic flow:

  1. Discover available prompts and resources.
  2. Call a create_math_specification_from_* tool with the appropriate input (Text, URL, or Image) to generate a structured MathSpec JSON.
  3. Call the build_interactive_proof_from_specification tool with the generated JSON to produce the final, self-contained index.html interactive proof.
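The flow above can be sketched as a small helper that selects the specification tool for the input type and then passes its result to the build tool. The tool names come from this README; the `call_tool` callable, the helper `generate_proof`, and the argument keys `input` and `specification` are assumptions, since the real parameter names live in the server schema.

```python
from typing import Callable

# call_tool(name, arguments) -> result text; supplied by your MCP client library.
CallTool = Callable[[str, dict], str]

SPEC_TOOLS = {
    "text": "create_math_specification_from_text",
    "url": "create_math_specification_from_url",
    "image": "create_math_specification_from_image",
}


def generate_proof(kind: str, value: str, call_tool: CallTool) -> str:
    """Pick the spec tool for the input kind, then build the proof from its output."""
    if kind not in SPEC_TOOLS:
        raise ValueError(f"unsupported input kind: {kind}")
    spec_json = call_tool(SPEC_TOOLS[kind], {"input": value})
    return call_tool(
        "build_interactive_proof_from_specification",
        {"specification": spec_json},
    )


# Fake transport that records calls, so the sketch runs without a server.
calls = []


def fake_call_tool(name: str, arguments: dict) -> str:
    calls.append(name)
    return "<html></html>" if name.startswith("build") else '{"steps": []}'


result = generate_proof("url", "https://example.com/lesson", fake_call_tool)
```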

[Figure: programmatic flow diagram]

Note: All data is processed securely. Your API key is only used to communicate with Google's Gemini API.

Resources & Links

Live Demo & Demo Video

Social Media

Read the announcement and join the discussion:

Judging Criteria

This submission addresses all hackathon requirements for Track 1 (Building MCP) as follows:

  • Completion:
  • UI/UX Polish: Dark-themed interface with auto-loaded examples, collapsible accordions for advanced features, and responsive iframe rendering for seamless proof exploration
  • Functionality: MCP server at /gradio_api/mcp/ exposing 4 tools (create_math_specification_from_text/url/image, build_interactive_proof_from_specification), 3 prompts and 4 resources for programmatic access from Claude Desktop, VSCode, and other MCP clients
  • Creativity: Two-stage AI pipeline (Stage 1: Gemini 2.5 Flash concept analysis + Stage 2: Gemini 3.0 Pro code generation), multi-modal input (text/URL/image with OCR), streaming thought processes, and interactive HTML5/Canvas visualizations
  • Documentation: Comprehensive README.md, PRD.md with technical specs, demo video, and inline code documentation

Contributing & Support

Contributions Welcome! Areas of interest:

  • Performance optimizations
  • UI/UX improvements
  • Additional mathematical domains
  • Documentation and examples

Support Channels:

Acknowledgments

  • Google AI for the incredible Gemini models
  • Gradio for the amazing Python web framework
  • Nano Banana for image assets
  • GitHub Copilot for Vibe Coding

License

This project is licensed under the MIT License. Original work created for the MCP's 1st Birthday Hackathon (November 2025).

Built with ❤️ for visual learners everywhere

Making abstract math concrete, one proof at a time

Like this space if you found it helpful!