namish10 committed on
Commit
24551d0
·
verified ·
1 Parent(s): 6bd4a66

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +71 -12
README.md CHANGED
@@ -1,35 +1,94 @@
 
 
 
 
 
 
 
 
 
  # ContextFlow RL Doubt Predictor

- ## Overview
- This is the trained reinforcement learning model for ContextFlow doubt prediction system.

  ## Model Details
- - **Algorithm**: GRPO (Group Relative Policy Optimization) + Q-Learning
- - **State Dimension**: 64 features
- - **Action Dimension**: 10 doubt prediction actions
- - **Policy Version**: 50
- - **Training Samples**: 200

  ## Usage
 
  ```python
  import pickle
  from huggingface_hub import hf_hub_download

- # Download checkpoint
  path = hf_hub_download(repo_id='namish10/contextflow-rl', filename='checkpoint.pkl')
-
- # Load checkpoint
  with open(path, 'rb') as f:
      checkpoint = pickle.load(f)

- print(f"Policy version: {checkpoint.policy_version}")
  ```

  ## Citation
  ```bibtex
  @software{contextflow_rl,
  title={ContextFlow RL Doubt Predictor},
  author={ContextFlow Team},
- year={2026}
  }
  ```
 
 
 
 
 
 
 
+ ---
+ license: apache-2.0
+ tags:
+ - reinforcement-learning
+ - education
+ - doubt-prediction
+ - q-learning
+ ---
+
  # ContextFlow RL Doubt Predictor

+ A reinforcement learning model that predicts when learners will get confused **before** it happens, using hand gesture recognition and privacy-first face blurring.

  ## Model Details
+
+ | Property | Value |
+ |----------|-------|
+ | **Algorithm** | GRPO + Q-Learning |
+ | **State Dimension** | 64 features |
+ | **Action Dimension** | 10 doubt predictions |
+ | **Policy Version** | 50 |
+ | **Training Samples** | 200 |
+ | **Framework** | PyTorch |
+
+ ## Architecture
+
+ ```
+ Q-Network: 64 → 128 → 128 → 10
+ ├── State Encoder (64 features)
+ ├── Hidden Layer 1 (128 units, ReLU)
+ ├── Hidden Layer 2 (128 units, ReLU)
+ └── Output Layer (10 actions)
+ ```
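The layer sizes in the diagram can be illustrated with a small forward pass. The sketch below is written in plain Python so it runs without PyTorch; it is not the repository's `QNetwork` class. The weights are random placeholders, the names `init_layers` and `forward` are hypothetical, and reading "State Encoder" as the 64-feature input rather than a separate layer is an assumption.

```python
import random

LAYER_SIZES = [64, 128, 128, 10]  # state -> hidden 1 -> hidden 2 -> actions


def init_layers(sizes, seed=0):
    """Create (weights, biases) per layer. Placeholder values, not trained ones."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, state):
    """Return one Q-value per action for a single 64-feature state."""
    activations = list(state)
    for index, (weights, biases) in enumerate(layers):
        outputs = [
            sum(w * a for w, a in zip(row, activations)) + b
            for row, b in zip(weights, biases)
        ]
        is_output_layer = index == len(layers) - 1
        # ReLU on the two hidden layers; the output layer stays linear.
        activations = outputs if is_output_layer else [max(0.0, o) for o in outputs]
    return activations


layers = init_layers(LAYER_SIZES)
rng = random.Random(1)
state = [rng.gauss(0.0, 1.0) for _ in range(64)]

q_values = forward(layers, state)
# Greedy choice: the action index with the largest Q-value.
best_action = max(range(len(q_values)), key=q_values.__getitem__)
```

With trained weights in place of the placeholders, `best_action` would be the index of the predicted doubt action for that state.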
+
+ ## Features (64-dimensional state vector)
+
+ The state vector encodes:
+ 1. **Topic Embedding** (32 dims) - TF-IDF representation of learning topic
+ 2. **Progress** (1 dim) - Session progress percentage
+ 3. **Confusion Signals** (16 dims) - Behavioral indicators:
+    - Mouse hesitation patterns
+    - Scroll reversals
+    - Time on page
+    - Eye tracking (if available)
+ 4. **Gesture Signals** (14 dims) - Hand gesture frequencies
+ 5. **Time Spent** (1 dim) - Total session time
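The five groups above add up to 32 + 1 + 16 + 14 + 1 = 64 dimensions. As a rough sketch of how such a vector might be assembled, the snippet below concatenates the groups in the listed order. The ordering, the group keys, and the helper name `build_state` are assumptions made for illustration; the actual layout is defined by the training code.

```python
# Group sizes come from the feature list; names and ordering are assumed.
FEATURE_LAYOUT = [
    ("topic_embedding", 32),
    ("progress", 1),
    ("confusion_signals", 16),
    ("gesture_signals", 14),
    ("time_spent", 1),
]


def build_state(features):
    """Concatenate named feature groups into one flat 64-element list."""
    state = []
    for name, size in FEATURE_LAYOUT:
        values = list(features[name])
        if len(values) != size:
            raise ValueError(f"{name}: expected {size} values, got {len(values)}")
        state.extend(float(v) for v in values)
    return state


# Dummy inputs, only to show the shapes involved.
example = {
    "topic_embedding": [0.0] * 32,
    "progress": [0.4],
    "confusion_signals": [0.0] * 16,
    "gesture_signals": [0.0] * 14,
    "time_spent": [12.5],
}
state = build_state(example)
```

Checking each group's length up front catches a mis-sized input before it silently shifts every later feature.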
+
+ ## Reward Function
+
+ The model optimizes for:
+ - **Correct doubt prediction**: +1.0
+ - **Helpful explanation provided**: +0.5
+ - **User engagement maintained**: +0.3
+ - **False positive**: -0.5
+ - **Missed confusion**: -1.0
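One way to read the list above is as a lookup table of per-event rewards. The sketch below assumes that the terms for a single step are simply added together; the README does not say how they combine, and the event keys and the function name `step_reward` are hypothetical.

```python
# Values are taken from the list above; the key names are illustrative.
REWARDS = {
    "correct_prediction": 1.0,
    "helpful_explanation": 0.5,
    "engagement_maintained": 0.3,
    "false_positive": -0.5,
    "missed_confusion": -1.0,
}


def step_reward(events):
    """Sum the reward terms for the events observed in one step (assumed additive)."""
    return sum(REWARDS[event] for event in events)


# A correct prediction followed by a helpful explanation.
good_step = step_reward(["correct_prediction", "helpful_explanation"])
# A prediction that turned out to be a false alarm.
bad_step = step_reward(["false_positive"])
```

An unknown event name raises `KeyError`, which keeps a typo from quietly contributing zero reward.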

  ## Usage
+
  ```python
  import pickle
+ import numpy as np
  from huggingface_hub import hf_hub_download

+ # Load model
  path = hf_hub_download(repo_id='namish10/contextflow-rl', filename='checkpoint.pkl')
  with open(path, 'rb') as f:
      checkpoint = pickle.load(f)

+ # Extract Q-network
+ q_weights = checkpoint.q_network_weights
+
+ # Create state vector (64 features)
+ state = np.random.randn(64)
+
+ # Predict doubt actions
+ # (Requires instantiating QNetwork class from train_rl.py)
  ```

  ## Citation
+
  ```bibtex
  @software{contextflow_rl,
  title={ContextFlow RL Doubt Predictor},
  author={ContextFlow Team},
+ year={2026},
+ url={https://github.com/contextflow}
  }
  ```
+
+ ## Limitations
+
+ - Trained on 200 synthetic samples (limited real-world data)
+ - Hand gesture recognition requires MediaPipe
+ - Privacy-first: face auto-blurred during gesture capture