metadata
license: apache-2.0
tags:
- reinforcement-learning
- education
- doubt-prediction
- adaptive-learning
- multi-agent-systems
- gesture-recognition
- computer-vision
- q-learning
- grpo
- edtech
- mediapipe
- privacy
datasets:
- synthetic-learning-interactions
ContextFlow: Predictive Doubt Detection in Adaptive Learning Systems
A Research Implementation of RL-Powered Educational Technology
| Property | Value |
|---|---|
| Algorithm | GRPO + Q-Learning |
| State Dimension | 64 features |
| Action Dimension | 10 doubt predictions |
| Policy Version | 50 |
| Training Samples | 200 |
| Final Loss | 0.2465 |
| Avg Reward | 0.75 |
Overview
ContextFlow uses reinforcement learning and behavioral signal analysis to predict student confusion before it occurs. When a learner's behavior suggests they may be struggling (mouse hesitation, scroll reversals, help-seeking gestures), the system proactively offers assistance.
Architecture
┌─────────────────────────────────────────────────┐
│              9 Specialized Agents               │
├─────────────────────────────────────────────────┤
│ • StudyOrchestrator    • DoubtPredictorAgent    │
│ • BehavioralAgent      • HandGestureAgent       │
│ • RecallAgent          • KnowledgeGraphAgent    │
│ • PeerLearningAgent    • LLMOrchestrator        │
│ • GestureActionMapper  • PromptAgent            │
└─────────────────────────────────────────────────┘
Quick Start
# Download the checkpoint from the Hugging Face Hub and load it
from huggingface_hub import hf_hub_download
import pickle

path = hf_hub_download(
    repo_id='namish10/contextflow-rl',
    filename='checkpoint.pkl'
)

# pickle.load can execute arbitrary code; only unpickle files you trust
with open(path, 'rb') as f:
    checkpoint = pickle.load(f)

print(f"Policy version: {checkpoint.policy_version}")
print(f"Training samples: {checkpoint.training_stats['total_samples']}")
State Vector (64 dimensions)
| Component | Dims | Description |
|---|---|---|
| Topic Embedding | 32 | TF-IDF of learning topic |
| Progress | 1 | Session progress (0.0-1.0) |
| Confusion Signals | 16 | Behavioral indicators |
| Gesture Signals | 14 | Hand gesture frequencies |
| Time Spent | 1 | Normalized session time |
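The layout in the table can be sketched as a plain concatenation of the five components. This is a minimal illustration, not the repository's `feature_extractor.py`: the function name `build_state`, the argument names, and the component order (taken from the table's row order) are all assumptions.

```python
from typing import List, Sequence

# Component sizes copied from the table above; the ordering is an assumption.
STATE_LAYOUT = (
    ("topic_embedding", 32),
    ("progress", 1),
    ("confusion_signals", 16),
    ("gesture_signals", 14),
    ("time_spent", 1),
)
STATE_DIM = sum(size for _, size in STATE_LAYOUT)  # 32 + 1 + 16 + 14 + 1 = 64


def build_state(
    topic_embedding: Sequence[float],
    progress: float,
    confusion_signals: Sequence[float],
    gesture_signals: Sequence[float],
    time_spent: float,
) -> List[float]:
    """Concatenate the five components into one flat 64-element list."""
    parts = {
        "topic_embedding": list(topic_embedding),
        "progress": [progress],
        "confusion_signals": list(confusion_signals),
        "gesture_signals": list(gesture_signals),
        "time_spent": [time_spent],
    }
    state: List[float] = []
    for name, size in STATE_LAYOUT:
        # Reject components whose length does not match the documented size.
        if len(parts[name]) != size:
            raise ValueError(f"{name}: expected {size} values, got {len(parts[name])}")
        state.extend(float(v) for v in parts[name])
    return state


# Hypothetical input: all-zero signals, 40% progress, 25% of normalized time.
example_state = build_state([0.0] * 32, 0.4, [0.0] * 16, [0.0] * 14, 0.25)
print(len(example_state))  # 64
```

Validating each component's length up front makes a mismatch between the extractor and the network's 64-feature input fail early instead of silently shifting features.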
Actions (10 doubt predictions)
- what_is_backpropagation
- why_gradient_descent
- how_overfitting_works
- explain_regularization
- what_loss_function
- how_optimization_works
- explain_learning_rate
- what_regularization
- how_batch_norm_works
- explain_softmax
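One way to read these labels is as names for the 10 outputs of the Q-network, with an epsilon-greedy rule choosing among them. The sketch below assumes that mapping: the `ACTIONS` order follows the order in which the labels appear above, and `select_action` with its parameters is a hypothetical helper, not a function from the repository.

```python
import random
from typing import List, Optional, Sequence

# Labels in the order they appear above; treating index i as Q-network
# output i is an assumption.
ACTIONS: List[str] = [
    "what_is_backpropagation",
    "why_gradient_descent",
    "how_overfitting_works",
    "explain_regularization",
    "what_loss_function",
    "how_optimization_works",
    "explain_learning_rate",
    "what_regularization",
    "how_batch_norm_works",
    "explain_softmax",
]


def select_action(
    q_values: Sequence[float],
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> str:
    """Epsilon-greedy: explore with probability epsilon, otherwise take the argmax."""
    if len(q_values) != len(ACTIONS):
        raise ValueError("expected one Q-value per action")
    rng = rng if rng is not None else random.Random()
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)  # explore: uniform random doubt prediction
    best_index = max(range(len(ACTIONS)), key=lambda i: q_values[i])
    return ACTIONS[best_index]  # exploit: highest estimated value


# Hypothetical Q-values in which output 6 is the largest.
q = [0.0] * 10
q[6] = 1.0
print(select_action(q, epsilon=0.0))  # explain_learning_rate
```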
Training Results
| Epoch | Loss | Epsilon | Avg Reward |
|---|---|---|---|
| 1 | 1.2456 | 1.000 | 0.20 |
| 2 | 0.8923 | 0.995 | 0.35 |
| 3 | 0.6541 | 0.990 | 0.48 |
| 4 | 0.4127 | 0.985 | 0.62 |
| 5 | 0.2465 | 0.980 | 0.75 |
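The Epsilon column is consistent with a multiplicative decay of 0.995 per epoch starting from 1.0 (a subtractive step of 0.005 gives the same values at three decimals over these five epochs). Which rule `train_rl.py` actually uses is not stated here, so the sketch below is only an assumption that happens to reproduce the column; `epsilon_schedule` is a hypothetical helper.

```python
from typing import List


def epsilon_schedule(epochs: int, start: float = 1.0, decay: float = 0.995) -> List[float]:
    """Return epsilon for each epoch, multiplying by `decay` after every epoch."""
    values = []
    epsilon = start
    for _ in range(epochs):
        values.append(round(epsilon, 3))  # rounded to match the table's precision
        epsilon *= decay
    return values


print(epsilon_schedule(5))  # [1.0, 0.995, 0.99, 0.985, 0.98]
```

If epsilon drives epsilon-greedy exploration, a value of 0.980 at epoch 5 means nearly all actions in the reported run were still chosen at random.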
Key Features
- Predictive Detection: RL-based prediction of confusion before it happens
- Multi-Agent Orchestration: 9 specialized agents working together
- Gesture Recognition: Privacy-first hand gesture detection with MediaPipe
- Face Blurring: Real-time face blur for classroom deployment
- Browser AI Launch: Direct AI chat interface from predicted doubts
- Spaced Repetition: SM-2 based review scheduling
- Knowledge Graphs: Concept mapping and learning paths
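For readers unfamiliar with the SM-2 rule named in the list above, the sketch below follows one widely used formulation (quality grades 0 to 5, ease factor floor of 1.3). How ContextFlow applies SM-2 is not described here, so `ReviewState` and `sm2_update` are hypothetical names, and details such as the rounding convention vary between implementations.

```python
from dataclasses import dataclass


@dataclass
class ReviewState:
    repetitions: int = 0      # consecutive successful reviews
    interval_days: int = 0    # days until the next review
    ease_factor: float = 2.5  # conventional SM-2 starting ease factor


def sm2_update(state: ReviewState, quality: int) -> ReviewState:
    """Apply one SM-2 review with a recall quality grade from 0 to 5."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    if quality >= 3:
        # Successful recall: intervals go 1 day, 6 days, then grow by the ease factor.
        if state.repetitions == 0:
            interval = 1
        elif state.repetitions == 1:
            interval = 6
        else:
            interval = round(state.interval_days * state.ease_factor)
        repetitions = state.repetitions + 1
    else:
        # Failed recall: restart the repetition sequence.
        repetitions, interval = 0, 1
    miss = 5 - quality
    ease = state.ease_factor + (0.1 - miss * (0.08 + miss * 0.02))
    return ReviewState(repetitions, interval, max(1.3, ease))


# Three perfect reviews in a row.
state = ReviewState()
intervals = []
for grade in (5, 5, 5):
    state = sm2_update(state, grade)
    intervals.append(state.interval_days)
print(intervals)  # [1, 6, 16]
```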
Files
| File | Description |
|---|---|
| checkpoint.pkl | Trained Q-network weights |
| train_rl.py | Training script with GRPO |
| feature_extractor.py | 64-dim state extraction |
| inference_example.py | Usage examples |
| demo.ipynb | Interactive notebook |
| RESEARCH_PAPER.md | Full research paper |
| evaluation_results.json | Training metrics |
| requirements.txt | Dependencies |
| app/ | Backend agents (Flask API) |
| frontend/ | React frontend |
Evaluation
See EVALUATION.md for detailed metrics and production readiness assessment.
Citation
@software{contextflow,
title={ContextFlow: Predictive Doubt Detection in Adaptive Learning Systems},
author={ContextFlow Team},
year={2026},
version={1.0},
url={https://huggingface.co/namish10/contextflow-rl}
}