modelbuilderhq committed on
Commit 8c627b1 · verified · 1 Parent(s): 160c47d

Upload folder using huggingface_hub

  1. README.md +4 -6
README.md CHANGED
@@ -36,9 +36,7 @@ The agent gets a dense plain-text briefing, takes one structured action, and is
  - **Trainable reward signal**: dense step reward for learning plus bounded graders for evaluation.
  - **Hackathon fit**: fully OpenEnv-packaged, hostable on HF Spaces, with reproducible training and visible before/after evidence.

- ## Judging-Criteria Mapping
-
- ### 1) Environment Innovation (40%)
+ ### 1) Our Innovation

  - The observation is a realistic text briefing, not a toy tabular state dump.
  - Actions are schema-bound (`GhostexecAction`) and validated against live world ids.
@@ -53,7 +51,7 @@ The agent gets a dense plain-text briefing, takes one structured action, and is
  | `monday_morning` | medium | `scenarios/monday_morning.json` |
  | `dinner_disaster` | hard | `scenarios/dinner_disaster.json` |

- ### 2) Storytelling and Presentation (30%)
+ ### 2) Overview

  Ghostexec tells a familiar high-stakes story: too many urgent asks, not enough time, and every action has social + operational consequences.

@@ -62,7 +60,7 @@ The demo is easy to follow:
  2. compare weak vs better action choice,
  3. show reward movement and policy behavior improvements.

- ### 3) Showing Improvement in Rewards (20%)
+ ### 3) Improvement in Rewards

  The repo includes persisted training artifacts and plot outputs:

@@ -89,7 +87,7 @@ The repo includes persisted training artifacts and plot outputs:
  | Invalid action rate | `Not logged in saved artifacts` | `Not logged in saved artifacts` |
  | Grader score | `Not logged in saved artifacts` | `Not logged in saved artifacts` |

- ### 4) Reward and Training Pipeline (10%)
+ ### 4) Reward and Training Pipeline

  Ghostexec uses a coherent weighted reward core plus bounded shaping:
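
The last hunk stops before the README's actual reward definition, so the real formula is not visible in this diff. As a rough illustration only, a "weighted reward core plus bounded shaping" could be sketched as below. The component names (`progress`, `relationship`, `time_cost`), the weights, and the shaping bound are all assumptions made for this example and are not taken from the Ghostexec repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; the real values are not shown in this diff.
    progress: float = 0.6
    relationship: float = 0.3
    time_cost: float = 0.1


def clip(value: float, low: float, high: float) -> float:
    """Clamp value into the closed interval [low, high]."""
    return max(low, min(high, value))


def step_reward(
    progress: float,
    relationship: float,
    time_cost: float,
    shaping: float,
    weights: RewardWeights = RewardWeights(),
    shaping_bound: float = 0.2,
) -> float:
    """Weighted core reward plus a shaping term clipped to +/- shaping_bound."""
    core = (
        weights.progress * progress
        + weights.relationship * relationship
        - weights.time_cost * time_cost
    )
    # Bounding the shaping term keeps it from dominating the weighted core.
    return core + clip(shaping, -shaping_bound, shaping_bound)
```

Clipping the shaping term means that however large the raw shaping signal is, it can shift the step reward by at most `shaping_bound` in either direction, which is one common way to keep shaping from swamping the core objective.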