OGrohit committed on
Commit 2fee0ff · verified · 1 Parent(s): 9979948

Upload README.md

  1. README.md +54 -0
README.md CHANGED
@@ -144,6 +144,29 @@ Dense, shaped signal across the full trajectory — not just binary win/lose:
  - **Episodes:** 50 per task (150 total)
  - **Hardware:** NVIDIA T4 GPU (Colab)

+ ### Experimental Tracking
+
+ Training results are logged automatically so that a completed run can be verified afterwards:
+
+ - **`./logs/{task}_results.csv`**: per-episode rewards and step counts (updated live during training)
+ ```
+ episode,reward,steps
+ 1,+0.255,8
+ 2,+0.240,7
+ 3,+0.290,6
+ ...
+ ```
+ - **`./phase2_checkpoints/{task}_ep*.json`**: checkpoint data at episodes 25, 50, 75, etc.
+
+ **To verify training results after running:**
+ ```bash
+ # Check CSV files exist and contain data
+ head ./logs/cascading_failure_results.csv
+
+ # Plot results yourself (saves reward_curve.png):
+ python -c "import pandas as pd; pd.read_csv('./logs/cascading_failure_results.csv').plot(x='episode', y='reward').get_figure().savefig('reward_curve.png')"
+ ```
+
  ### Results

  | Task | First 10 Episodes | Last 10 Episodes | Improvement | Status |
@@ -293,6 +316,37 @@ python train.py \
  - [x] `/grader` endpoint
  - [x] HF Space deployed and healthy
  - [x] Baseline inference script
+ - [x] Experimental tracking (CSV + checkpoints)
+
+ ## Verifying Training Execution
+
+ **For judges to verify training actually happened:**
+
+ ```bash
+ # 1. Check CSV log files exist
+ ls -lh ./logs/
+
+ # 2. View a sample of episode results
+ head -20 ./logs/cascading_failure_results.csv
+
+ # 3. Check checkpoint files exist
+ ls -lh ./phase2_checkpoints/
+
+ # 4. Plot training curves from CSV
+ python -c "
+ import pandas as pd
+ import matplotlib.pyplot as plt
+
+ df = pd.read_csv('./logs/cascading_failure_results.csv')
+ plt.figure(figsize=(10, 6))
+ plt.plot(df['episode'], df['reward'].astype(float))
+ plt.xlabel('Episode')
+ plt.ylabel('Reward')
+ plt.title('Cascading Failure Task - GRPO Training')
+ plt.savefig('verification_curve.png')
+ print('✓ Verification curve saved')
+ "
+ ```

  ---
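
Beyond plotting, the CSV log added in this commit could also be checked numerically. The following is a minimal sketch, not part of the commit: it assumes the `episode,reward,steps` column layout shown in the README sample, and the helper name `summarize_rewards` and its `window` parameter are hypothetical. It compares the mean reward over the first and last 10 episodes, echoing the columns of the Results table; "improvement" here is assumed to be the plain difference between the two means, which may not match how the README computes it.

```python
import csv
from statistics import mean


def summarize_rewards(csv_path, window=10):
    """Compare mean reward over the first and last `window` episodes.

    Hypothetical helper: assumes a CSV with a `reward` column, as in the
    README sample (`episode,reward,steps`).
    """
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))

    # float() accepts the signed notation used in the log, e.g. "+0.255".
    rewards = [float(row["reward"]) for row in rows]

    if len(rewards) < 2 * window:
        raise ValueError(
            f"need at least {2 * window} episodes, found {len(rewards)}"
        )

    first_mean = mean(rewards[:window])
    last_mean = mean(rewards[-window:])
    return {
        "episodes": len(rewards),
        "first_mean": first_mean,
        "last_mean": last_mean,
        # Assumption: improvement is the absolute difference of the means.
        "improvement": last_mean - first_mean,
    }
```

For example, `summarize_rewards('./logs/cascading_failure_results.csv')` would return a dictionary of the episode count and the two window means, provided the log holds at least 20 episodes.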