
metadata
title: MentorFlow
emoji: 🎓
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.9.0
app_file: app.py
pinned: false
license: mit
hardware: gpu-t4

MentorFlow - Teacher-Student RL System

A meta-curriculum reinforcement learning system where an AI Teacher Agent learns to select optimal educational tasks to train an AI Student Agent.

🚀 Features

  • Three Training Strategies: Compare Random, Progressive, and Teacher-guided curricula
  • LM Student (DistilBERT): Real neural network learning with memory decay
  • GPU Support: Fast training with CUDA acceleration
  • Interactive Comparison: Visualize learning curves and performance metrics
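
As a rough illustration of the two baseline strategies, the sketch below picks tasks from a grid of 15 topics and 7 difficulties. The function names, the uniform sampling, and the linear difficulty ramp are assumptions made for this example and may not match what compare_strategies.py actually does; the Teacher-guided strategy is left out here.

```python
import random

N_TOPICS = 15        # topic count stated in this README
N_DIFFICULTIES = 7   # difficulty count stated in this README


def random_task(rng):
    """Random baseline (assumed): topic and difficulty drawn uniformly."""
    return rng.randrange(N_TOPICS), rng.randrange(N_DIFFICULTIES)


def progressive_task(step, total_steps, rng):
    """Progressive baseline (assumed): difficulty rises linearly with progress."""
    difficulty = min(N_DIFFICULTIES - 1, step * N_DIFFICULTIES // total_steps)
    return rng.randrange(N_TOPICS), difficulty


if __name__ == "__main__":
    rng = random.Random(0)
    for step in (0, 250, 499):
        print(step, progressive_task(step, 500, rng))
```

Under this ramp the first iterations use the easiest difficulty and the last ones use the hardest, while the random baseline ignores training progress entirely.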

📊 Usage

  1. Set Parameters:

    • Iterations: Number of training iterations (50-500)
    • Seed: Random seed for reproducibility
    • Device: Choose GPU (cuda) or CPU
  2. Run Comparison:

    • Click "Run Comparison" to start training
    • Monitor progress in the output text
    • View generated comparison plots
  3. Analyze Results:

    • Learning curves show how each strategy improves
    • Difficult-question performance shows each strategy's final accuracy on hard questions
    • Curriculum diversity shows topic coverage
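
The three parameters from step 1 can be pictured as a small validated configuration. This is a minimal sketch, not the code in app.py: RunConfig, resolve_device, and seeded_rng are hypothetical names, the default values are placeholders, and the CPU fallback when CUDA is unavailable is an assumption.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container for the three UI parameters."""
    iterations: int = 500   # placeholder default, not taken from app.py
    seed: int = 42          # placeholder default, not taken from app.py
    device: str = "cpu"

    def __post_init__(self):
        # The README gives 50-500 as the valid iteration range.
        if not 50 <= self.iterations <= 500:
            raise ValueError("iterations must be between 50 and 500")
        if self.device not in ("cuda", "cpu"):
            raise ValueError("device must be 'cuda' or 'cpu'")


def resolve_device(requested):
    """Fall back to CPU when CUDA is requested but cannot be used."""
    if requested != "cuda":
        return "cpu"
    try:
        import torch  # optional dependency in this sketch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def seeded_rng(seed):
    """Return an isolated generator so runs with the same seed match."""
    return random.Random(seed)
```

Using a dedicated random.Random instance instead of the global generator keeps the seed's effect local to one comparison run.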

⚑ Performance

  • With GPU: ~5-10 minutes for 500 iterations
  • With CPU: ~15-30 minutes for 500 iterations

📁 Project Structure

MentorFlow/
├── app.py                      # Gradio web interface
├── teacher_agent_dev/          # Teacher agent system
│   ├── compare_strategies.py   # Main comparison script
│   ├── teacher_agent.py        # UCB bandit teacher
│   └── ...
├── student_agent_dev/          # LM Student system
│   ├── student_agent.py        # DistilBERT student
│   └── ...
└── requirements_hf.txt         # Dependencies

🔧 Technical Details

  • Teacher Agent: UCB (Upper Confidence Bound) multi-armed bandit
  • Student Agent: DistilBERT with online learning
  • Memory Decay: Ebbinghaus forgetting curve
  • Task Generator: Procedural generation with 15 topics × 7 difficulties
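
To make the teacher and memory-decay components concrete, here is a minimal sketch of a textbook UCB1 bandit and an exponential forgetting curve. It is not the code in teacher_agent.py or student_agent.py: the class and function names, the exploration constant, the reward signal, and the exact form R = exp(-t / S) are assumptions chosen for illustration.

```python
import math


class UCB1Teacher:
    """Textbook UCB1 over a fixed set of arms (e.g. 15 x 7 = 105 task types)."""

    def __init__(self, n_arms, c=math.sqrt(2)):
        self.counts = [0] * n_arms     # times each arm was chosen
        self.values = [0.0] * n_arms   # running mean reward per arm
        self.c = c                     # exploration constant (assumed)

    def select(self):
        # Play every arm once before trusting the confidence bound.
        for arm, n in enumerate(self.counts):
            if n == 0:
                return arm
        total = sum(self.counts)
        scores = [
            v + self.c * math.sqrt(math.log(total) / n)
            for v, n in zip(self.values, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        # Incremental mean; reward is assumed to measure student improvement.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def retention(elapsed, strength):
    """Ebbinghaus-style forgetting (assumed form): R = exp(-elapsed / strength)."""
    return math.exp(-elapsed / strength)
```

The bonus term shrinks as an arm is chosen more often, so the teacher keeps revisiting rarely tried task types while favouring those with the highest observed reward. In the retention function, a larger strength value means slower forgetting.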

📖 More Information

See the main repository for detailed documentation and development guides.