Operator Self-Improve Loop
Clean product one-liner: this is the smallest safe loop for improving the product shell without inviting free-form repo drift.
Why: an operator should know how to ask for one useful change, review it, benchmark it, and keep the proof trail.
Pareto Frontier
| Option | Speed | Safety | Learning value | Real-world analog |
|---|---|---|---|---|
| edit files ad hoc | high | low | low | hallway fix with no ticket |
| propose one bounded manifest | high | high | high | editor assigns one scoped revision |
| broad autonomous rewrite | low | low | unclear | newsroom redesign during print deadline |
Layman read: the middle path is the product path. It is fast enough to use and constrained enough to trust.
The Loop
| Step | Operator move | Surface | Why it matters | Real-world analog |
|---|---|---|---|---|
| 1. Frame one gap | name one missing doc, config, or thin glue improvement | goal sentence | small asks are reviewable | assign one article revision |
| 2. Propose | `rtk ./bin/bvtctl self-improve "<goal>"` | conversational front door | the system must suggest structured change before touching files | ask for a marked-up draft |
| 3. Review | inspect the proposed JSON, touched files, and benchmark | manifest + decision brief | you verify scope before execution opens | copy desk review |
| 4. Apply | `rtk ./bin/bvtctl self-improve-apply "<goal>"` | bounded execution lane | the runtime writes only the approved files | publish the approved edit |
| 5. Verify | `rtk ./bin/bvtctl benchmark` | benchmark surface | the shell should prove it still works after the edit | run press checks |
| 6. Promote or stop | keep the change only if the artifact is usable and the proof is honest | receipt + graph | improvement should compound from evidence, not vibes | archive only verified copy |
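
The ordering in the table above can be sketched as a small driver with an explicit review gate. This is an illustrative sketch, not part of `bvtctl`: the function `run_loop`, the injected `run` and `approve` callables, and the assumption that each command signals success with exit code 0 are all hypothetical. Only the three command lines are taken from the table.

```python
from typing import Callable, List

# Command prefixes copied from the loop table.
PROPOSE = ["rtk", "./bin/bvtctl", "self-improve"]
APPLY = ["rtk", "./bin/bvtctl", "self-improve-apply"]
BENCHMARK = ["rtk", "./bin/bvtctl", "benchmark"]


def run_loop(
    goal: str,
    run: Callable[[List[str]], int],
    approve: Callable[[], bool],
) -> str:
    """Walk steps 2 to 6 in order and stop at the first failure.

    `run` executes one argv list and returns an exit code (assumed: 0 means
    success). `approve` is the human review gate from step 3; nothing is
    applied unless it returns True.
    """
    if run(PROPOSE + [goal]) != 0:
        return "stop: proposal failed"
    if not approve():
        return "stop: rejected in review"
    if run(APPLY + [goal]) != 0:
        return "stop: apply failed"
    if run(BENCHMARK) != 0:
        return "stop: benchmark failed"
    return "promote"
```

In real use, `run` could be something like `lambda argv: subprocess.run(argv).returncode`, and `approve` a prompt that asks the operator to confirm after reading the manifest. Injecting both keeps the review step impossible to skip by accident.
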
Guardrails
| Guardrail | Current rule | Why in plain English |
|---|---|---|
| touched files | maximum 3 | small changes are easier to review and safer to benchmark |
| allowed actions | `write_file` only | the runtime should change content directly, not improvise shell side effects |
| preferred roots | `docs/`, `configs/`, thin `runtime/` or `policy/` glue | the product shell improves fastest at the boundary surfaces |
| benchmark | `rtk ./bin/bvtctl benchmark` | every approved change should pay rent in product proof |
| proof | receipt required | if there is no durable trail, the loop did not really learn |
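
During review, the first three guardrails can be checked mechanically before anything is applied. The sketch below assumes a manifest shaped like `{"steps": [{"action": ..., "path": ...}]}`; that shape, the field names, and the function `check_manifest` are hypothetical, since the real manifest schema is not described here. It also treats the preferred roots as a strict allow-list, which is stricter than "preferred" and may not match the runtime's actual rule.

```python
from pathlib import PurePosixPath

# Limits taken from the guardrail table.
MAX_TOUCHED_FILES = 3
ALLOWED_ACTIONS = {"write_file"}
PREFERRED_ROOTS = ("docs", "configs", "runtime", "policy")


def check_manifest(manifest: dict) -> list[str]:
    """Return a list of guardrail violations; an empty list means the manifest passes."""
    problems = []
    steps = manifest.get("steps", [])  # hypothetical field name

    # Guardrail: at most 3 distinct touched files.
    touched = {step.get("path", "") for step in steps}
    if len(touched) > MAX_TOUCHED_FILES:
        problems.append(f"touches {len(touched)} files, maximum is {MAX_TOUCHED_FILES}")

    for step in steps:
        action = step.get("action")
        raw_path = step.get("path", "")
        path = PurePosixPath(raw_path)

        # Guardrail: write_file is the only allowed action.
        if action not in ALLOWED_ACTIONS:
            problems.append(f"action {action!r} is not allowed")

        # Guardrail: stay inside the listed roots, with no absolute paths or '..' escapes.
        inside_root = (
            bool(path.parts)
            and not path.is_absolute()
            and ".." not in path.parts
            and path.parts[0] in PREFERRED_ROOTS
        )
        if not inside_root:
            problems.append(f"path {raw_path!r} is outside the allowed roots")

    return problems
```

A reviewer would reject the proposal whenever the returned list is non-empty, then fall back to reading the manifest by hand for everything this check cannot see, such as whether the content is any good.
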
What Good Looks Like
| Acceptance bar | Plain-language test |
|---|---|
| recommendation is clear | the next operator can tell what changed and why |
| route is grounded | the change stays inside the allowed roots and manifest contract |
| proof is honest | the receipt and benchmark say what really happened |
| artifact is usable | the new doc, config, or thin glue helps the shipped shell immediately |
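
One way to keep the four acceptance bars honest is to record each as an explicit yes or no in the receipt and promote only when all four pass. The sketch below is illustrative: the `AcceptanceCheck` class, the `render_receipt` function, and the JSON layout are hypothetical and do not describe the receipt format the runtime actually writes.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class AcceptanceCheck:
    """One boolean per row of the acceptance table, filled in by the reviewer."""
    recommendation_clear: bool
    route_grounded: bool
    proof_honest: bool
    artifact_usable: bool

    def failed(self) -> list[str]:
        # Names of the bars that did not pass, in table order.
        return [name for name, passed in asdict(self).items() if not passed]


def render_receipt(goal: str, check: AcceptanceCheck) -> str:
    """Serialize the decision so the next operator can see what happened and why."""
    failed = check.failed()
    receipt = {
        "goal": goal,
        "checks": asdict(check),
        "failed": failed,
        "decision": "stop" if failed else "promote",
    }
    return json.dumps(receipt, indent=2, sort_keys=True)
```

Recording the failed bars by name, rather than a single pass or fail flag, gives a stopped change a reason that survives in the trail.
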
Good First Requests
`rtk ./bin/bvtctl self-improve "add one missing operator doc for the self-improve loop"`
`rtk ./bin/bvtctl self-improve "tighten one confusing runtime command explanation"`
`rtk ./bin/bvtctl self-improve "add one benchmark note for operators"`
Why: these are product-shell improvements. They sharpen the front door instead of dragging the runtime into a large rewrite.