Operator Self-Improve Loop

Clean product one-liner: this is the smallest safe loop for improving the product shell without inviting free-form repo drift.

Why: an operator should know how to ask for one useful change, review it, benchmark it, and keep the proof trail.

Pareto Frontier

| Option | Speed | Safety | Learning value | Real-world analog |
| --- | --- | --- | --- | --- |
| edit files ad hoc | high | low | low | hallway fix with no ticket |
| propose one bounded manifest | high | high | high | editor assigns one scoped revision |
| broad autonomous rewrite | low | low | unclear | newsroom redesign during print deadline |

Layman read: the middle path is the product path. It is fast enough to use and constrained enough to trust.

The Loop

| Step | Operator move | Surface | Why it matters | Real-world analog |
| --- | --- | --- | --- | --- |
| 1. Frame one gap | name one missing doc, config, or thin glue improvement | goal sentence | small asks are reviewable | assign one article revision |
| 2. Propose | `rtk ./bin/bvtctl self-improve "<goal>"` | conversational front door | the system must suggest structured change before touching files | ask for a marked-up draft |
| 3. Review | inspect the proposed JSON, touched files, and benchmark | manifest + decision brief | you verify scope before execution opens | copy desk review |
| 4. Apply | `rtk ./bin/bvtctl self-improve-apply "<goal>"` | bounded execution lane | the runtime writes only the approved files | publish the approved edit |
| 5. Verify | `rtk ./bin/bvtctl benchmark` | benchmark surface | the shell should prove it still works after the edit | run press checks |
| 6. Promote or stop | keep the change only if the artifact is usable and the proof is honest | receipt + graph | improvement should compound from evidence, not vibes | archive only verified copy |
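The six steps can be sketched as a small driver. This is a hedged illustration, not the shipped runtime. The three `bvtctl` invocations come from the table above. The function names, the `approve` callback that stands in for human review, and the assumption that a nonzero exit code means a step failed are all hypothetical.

```python
import subprocess
from typing import Callable, Sequence

# Command prefix taken from the loop table above.
BVTCTL = ["rtk", "./bin/bvtctl"]

Runner = Callable[[Sequence[str]], int]


def run_command(argv: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(argv)).returncode


def self_improve_loop(
    goal: str,
    approve: Callable[[str], bool],
    run: Runner = run_command,
) -> str:
    """Walk steps 1-5 in order and stop at the first failure.

    Assumption: each command signals failure with a nonzero exit code.
    """
    # Step 1: frame one gap as a single goal sentence.
    if not goal.strip():
        return "stopped: no goal framed"
    # Step 2: propose a structured change before touching files.
    if run([*BVTCTL, "self-improve", goal]) != 0:
        return "stopped: proposal failed"
    # Step 3: a human reviews the manifest and decision brief.
    if not approve(goal):
        return "stopped: rejected in review"
    # Step 4: apply only the approved change.
    if run([*BVTCTL, "self-improve-apply", goal]) != 0:
        return "stopped: apply failed"
    # Step 5: verify that the shell still works.
    if run([*BVTCTL, "benchmark"]) != 0:
        return "stopped: benchmark failed"
    # Step 6 stays a human call: promote only if the proof is honest.
    return "ready to promote"
```

Passing `run` as a parameter keeps the sketch testable without the real binary. The review step is a callback because the loop expects a person to verify scope before execution opens.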

Guardrails

| Guardrail | Current rule | Why in plain English |
| --- | --- | --- |
| touched files | maximum 3 | small changes are easier to review and safer to benchmark |
| allowed actions | `write_file` only | the runtime should change content directly, not improvise shell side effects |
| preferred roots | `docs/`, `configs/`, thin `runtime/` or `policy/` glue | the product shell improves fastest at the boundary surfaces |
| benchmark | `rtk ./bin/bvtctl benchmark` | every approved change should pay rent in product proof |
| proof | receipt required | if there is no durable trail, the loop did not really learn |
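A reviewer could check these guardrails mechanically before approving a manifest. The sketch below assumes a manifest shape that this document does not specify: the keys `changes`, `action`, `path`, `benchmark`, and `receipt_path` are hypothetical, and the preferred roots are treated as a hard allow-list for simplicity. Only the limits themselves come from the table above.

```python
from pathlib import PurePosixPath

# Limits from the guardrail table.
MAX_TOUCHED_FILES = 3
ALLOWED_ACTIONS = {"write_file"}
ALLOWED_ROOTS = ("docs", "configs", "runtime", "policy")
BENCHMARK_COMMAND = "rtk ./bin/bvtctl benchmark"


def check_manifest(manifest: dict) -> list[str]:
    """Return guardrail violations; an empty list means the manifest passes.

    The manifest keys used here are assumptions, not the real schema.
    """
    problems = []
    changes = manifest.get("changes", [])
    if not changes:
        problems.append("manifest proposes no changes")

    # Guardrail: at most three touched files.
    touched = {change.get("path", "") for change in changes}
    if len(touched) > MAX_TOUCHED_FILES:
        problems.append(f"touches {len(touched)} files, maximum is {MAX_TOUCHED_FILES}")

    for change in changes:
        # Guardrail: write_file is the only allowed action.
        action = change.get("action")
        if action not in ALLOWED_ACTIONS:
            problems.append(f"action {action!r} is not allowed")
        # Guardrail: stay inside the listed roots, with no path escapes.
        path = PurePosixPath(change.get("path", ""))
        inside_root = bool(path.parts) and path.parts[0] in ALLOWED_ROOTS
        if path.is_absolute() or ".." in path.parts or not inside_root:
            problems.append(f"path {str(path)!r} is outside the allowed roots")

    # Guardrail: the benchmark command must be the documented one.
    if manifest.get("benchmark") != BENCHMARK_COMMAND:
        problems.append("benchmark command is missing or unexpected")
    # Guardrail: a proof receipt is required.
    if not manifest.get("receipt_path"):
        problems.append("no proof receipt path")
    return problems
```

Returning a list of violations rather than a boolean gives the operator a reason for each rejection, which fits the review step's goal of verifying scope before execution opens.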

What Good Looks Like

| Acceptance bar | Plain-language test |
| --- | --- |
| recommendation is clear | the next operator can tell what changed and why |
| route is grounded | the change stays inside the allowed roots and manifest contract |
| proof is honest | the receipt and benchmark say what really happened |
| artifact is usable | the new doc, config, or thin glue helps the shipped shell immediately |

Good First Requests

  • rtk ./bin/bvtctl self-improve "add one missing operator doc for the self-improve loop"
  • rtk ./bin/bvtctl self-improve "tighten one confusing runtime command explanation"
  • rtk ./bin/bvtctl self-improve "add one benchmark note for operators"

Why: these are product-shell improvements. They sharpen the front door instead of dragging the runtime into a large rewrite.