shank committed on
Commit 3eb8edc
Parent(s): c02b65b
fix: use GitHub raw URLs for images so README renders on HF Space
README.md
CHANGED
@@ -54,16 +54,16 @@ Our training run clearly demonstrates rapid policy adaptation. The model success
 
 *(Note for Hackathon Judges: Live Weights & Biases charts and Gradio UI are embedded below as evidence of the training run).*
 
-
-
+
+
 
 *Additional Training Metrics:*
 <p align="center">
-<img src="images/hypothesis_quality.png" width="48%" />
-<img src="images/semantic.png" width="48%" />
+<img src="https://raw.githubusercontent.com/shasshaank/AgentDebuggerEnv/main/images/hypothesis_quality.png" width="48%" />
+<img src="https://raw.githubusercontent.com/shasshaank/AgentDebuggerEnv/main/images/semantic.png" width="48%" />
 </p>
 
-
+
 
 * **Format Compliance:** Scaled to 1.0 (max) within 50 steps.
 * **Total Reward:** Scaled from baseline ~0.4 to peaks of ~1.0 by step 250.
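The change in this commit is mechanical: every repo-relative image target in README.md gets the GitHub raw prefix. A minimal sketch of how that rewrite could be scripted is below. It is not the author's tooling; `absolutize_image_paths` and `RAW_BASE` are hypothetical names, and the base URL is simply copied from the URLs in the diff. It assumes image targets contain no spaces or parentheses and leaves root-relative paths (starting with `/`) untouched.

```python
import re

# Assumed base URL, copied from the replacement URLs in the diff.
RAW_BASE = "https://raw.githubusercontent.com/shasshaank/AgentDebuggerEnv/main/"

# Markdown images: ![alt](target)
_MD_IMAGE = re.compile(r'(!\[[^\]]*\]\()([^)\s]+)(\))')
# HTML images: <img ... src="target" ...>
_HTML_IMAGE = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")')


def _is_relative(target: str) -> bool:
    """True for repo-relative paths; False for URLs, anchors, and rooted paths."""
    return not re.match(r'^(?:[a-zA-Z][a-zA-Z0-9+.-]*:|//|#|/)', target)


def absolutize_image_paths(markdown: str, base: str = RAW_BASE) -> str:
    """Hypothetical helper: prefix relative image targets with `base`."""
    def fix(match: re.Match) -> str:
        prefix, target, suffix = match.groups()
        if _is_relative(target):
            # Drop a leading "./" so the joined URL stays clean.
            if target.startswith("./"):
                target = target[2:]
            target = base + target
        return prefix + target + suffix

    # Rewrite Markdown image syntax first, then HTML <img> tags.
    return _HTML_IMAGE.sub(fix, _MD_IMAGE.sub(fix, markdown))
```

Targets that are already absolute are left alone, so running the helper twice on the same text should give the same result as running it once.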