title: MentorFlow
emoji: π
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.9.0
app_file: app.py
pinned: false
license: mit
hardware: gpu-t4
MentorFlow - Teacher-Student RL System
A meta-curriculum reinforcement learning system where an AI Teacher Agent learns to select optimal educational tasks to train an AI Student Agent.
Features
- Three Training Strategies: Compare Random, Progressive, and Teacher-guided curricula
- LM Student (DistilBERT): Real neural network learning with memory decay
- GPU Support: Fast training with CUDA acceleration
- Interactive Comparison: Visualize learning curves and performance metrics
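As a rough illustration of how the Random and Progressive baselines might differ, here is a minimal sketch. It is not the repository's code: the topic names are placeholders, the difficulty levels are assumed to be the integers 1 to 7, and the linear difficulty ramp in `progressive_strategy` is only one plausible reading of "Progressive".

```python
import random
from itertools import product

# Hypothetical task space: placeholder topic names and integer difficulty levels.
TOPICS = [f"topic_{i}" for i in range(15)]
DIFFICULTIES = list(range(1, 8))  # 7 levels, 1 = easiest (assumption)
TASKS = list(product(TOPICS, DIFFICULTIES))  # every (topic, difficulty) pair


def random_strategy(step, total_steps, rng):
    """Pick any (topic, difficulty) pair uniformly at random."""
    return rng.choice(TASKS)


def progressive_strategy(step, total_steps, rng):
    """Raise difficulty linearly with training progress; the topic stays random."""
    level_index = min(step * len(DIFFICULTIES) // total_steps, len(DIFFICULTIES) - 1)
    return rng.choice(TOPICS), DIFFICULTIES[level_index]


rng = random.Random(42)
total_steps = 70
# Difficulty level chosen by the progressive baseline at each step.
schedule = [progressive_strategy(s, total_steps, rng)[1] for s in range(total_steps)]
sample = random_strategy(0, total_steps, rng)
```

Both baselines ignore the student's results. The Teacher-guided strategy differs in that it uses feedback from the student to decide which task to present next.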
Usage
Set Parameters:
- Iterations: Number of training iterations (50-500)
- Seed: Random seed for reproducibility
- Device: Choose GPU (cuda) or CPU
Run Comparison:
- Click "Run Comparison" to start training
- Monitor progress in the output text
- View generated comparison plots
Analyze Results:
- Learning curves show how each strategy improves
- Difficult question performance shows final accuracy
- Curriculum diversity shows topic coverage
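The three parameters above can be pictured with a short sketch. The helper names (`validate_iterations`, `resolve_device`, `seeded_draws`) are hypothetical and do not come from `app.py`; the sketch assumes that the 50-500 range is enforced, that a `cuda` request falls back to CPU when no GPU is usable, and that the seed drives a pseudo-random generator.

```python
import random


def validate_iterations(iterations: int) -> int:
    """Reject values outside the documented 50-500 range."""
    if not 50 <= iterations <= 500:
        raise ValueError("iterations must be between 50 and 500")
    return iterations


def resolve_device(requested: str) -> str:
    """Return "cuda" only if it was requested and PyTorch reports a usable GPU."""
    if requested != "cuda":
        return "cpu"
    try:
        import torch  # optional dependency in this sketch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def seeded_draws(seed: int, n: int = 5) -> list:
    """Draw n pseudo-random numbers from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


# The same seed reproduces the same sequence; a different seed does not.
same = seeded_draws(42) == seeded_draws(42)
different = seeded_draws(1) != seeded_draws(2)
```

A full run would also need to seed NumPy and PyTorch if the training code draws random numbers from them, since seeding Python's `random` module alone does not affect those libraries.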
Performance
- With GPU: ~5-10 minutes for 500 iterations
- With CPU: ~15-30 minutes for 500 iterations
Project Structure
MentorFlow/
├── app.py                     # Gradio web interface
├── teacher_agent_dev/         # Teacher agent system
│   ├── compare_strategies.py  # Main comparison script
│   ├── teacher_agent.py       # UCB bandit teacher
│   └── ...
├── student_agent_dev/         # LM Student system
│   ├── student_agent.py       # DistilBERT student
│   └── ...
└── requirements_hf.txt        # Dependencies
Technical Details
- Teacher Agent: UCB (Upper Confidence Bound) multi-armed bandit
- Student Agent: DistilBERT with online learning
- Memory Decay: Ebbinghaus forgetting curve
- Task Generator: Procedural generation with 15 topics × 7 difficulties
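To make the UCB teacher and the forgetting curve more concrete, here is a minimal sketch of both ideas. It is not taken from `teacher_agent.py` or `student_agent.py`. It assumes the textbook UCB1 rule with one arm per task, a reward that stands in for student improvement, and the common exponential form of the Ebbinghaus curve, R = exp(-t / S). The class name, function name, and demo rewards are all hypothetical.

```python
import math


class UCB1Teacher:
    """Textbook UCB1 bandit; each arm stands for one task the teacher can assign."""

    def __init__(self, n_arms: int, c: float = math.sqrt(2)):
        self.counts = [0] * n_arms   # times each arm was chosen
        self.means = [0.0] * n_arms  # running mean reward per arm
        self.c = c                   # exploration weight

    def select(self) -> int:
        # Play every arm once before applying the confidence bound.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)
        scores = [
            mean + self.c * math.sqrt(math.log(total) / count)
            for mean, count in zip(self.means, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm: int, reward: float) -> None:
        # Incremental update of the running mean reward.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def retention(elapsed: float, strength: float) -> float:
    """Ebbinghaus-style retention R = exp(-t / S); S is memory strength (assumed form)."""
    return math.exp(-elapsed / strength)


# Demo with made-up, deterministic rewards (stand-ins for student improvement).
teacher = UCB1Teacher(n_arms=3)
rewards = [0.1, 0.9, 0.2]
for _ in range(200):
    arm = teacher.select()
    teacher.update(arm, rewards[arm])
best = max(range(3), key=teacher.counts.__getitem__)
```

In the demo, the teacher concentrates its choices on the arm with the highest reward while still sampling the others occasionally. In `retention`, a larger `strength` slows forgetting, which is one way repeated practice on a topic could be modeled.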
More Information
See the main repository for detailed documentation and development guides.