diff --git a/ROADMAP.md b/ROADMAP.md
index f0bc6b5..d0fd11b 100644
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -422,3 +422,23 @@ to:
 4. Update `config_validate` schema with the new field.
 
 **Action item:** Wire `RuntimeConfig::trusted_roots()` → `WorkerRegistry::spawn_worker()` default. Cover with test: config with `trusted_roots = ["/tmp"]` → spawning worker in `/tmp/x` auto-resolves trust without caller passing the field.
+
+## Observability Transport Decision (filed 2026-04-08)
+
+### Canonical state surface: CLI/file-based. HTTP endpoint deferred.
+
+**Decision:** `claw state` reading `.claw/worker-state.json` is the **blessed observability contract** for clawhip and downstream tooling. This is not a stepping-stone — it is the supported surface. Build against it.
+
+**Rationale:**
+- claw-code is a plugin running inside the opencode binary. It cannot add HTTP routes to `opencode serve` — that server belongs to upstream sst/opencode.
+- The file-based surface is fully within plugin scope: `emit_state_file()` in `worker_boot.rs` writes atomically on every `WorkerStatus` transition.
+- `claw state --output-format json` gives clawhip everything it needs: `status`, `is_ready`, `seconds_since_update`, `trust_gate_cleared`, `last_event`, `updated_at`.
+- Polling a local file has lower latency and fewer failure modes than an HTTP round-trip to a sidecar.
+- An HTTP state endpoint would require either (a) upstreaming a route to sst/opencode — a multi-week PR cycle with no guarantee of acceptance — or (b) a sidecar process that queries `WorkerRegistry` in-process, which is fragile and adds an extra failure domain.
+
+**What downstream tooling (clawhip) should do:**
+1. After `WorkerCreate`, poll `.claw/worker-state.json` (or run `claw state --output-format json`) in the worker's CWD at whatever interval makes sense (e.g. 5s).
+2. Trust `seconds_since_update > 60` in `trust_required` status as the stall signal.
+3. Call `WorkerResolveTrust` tool to unblock, or `WorkerRestart` to reset.
+
+**HTTP endpoint tracking:** Not scheduled. If a concrete use case emerges that file polling cannot serve (e.g. remote workers over a network boundary), open a new issue to upstream a `/worker/state` route to sst/opencode at that time. Until then: file/CLI is canonical.
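
Reviewer note: the three downstream steps in this hunk (poll the state file, detect the stall, then unblock or restart) could be sketched roughly as below. This is a minimal illustration, not clawhip's actual code. It assumes the JSON exposes `status` and `seconds_since_update` under exactly the names listed in the rationale; the diff only confirms those fields for `claw state --output-format json`, so whether the raw `.claw/worker-state.json` file also carries `seconds_since_update` is an assumption. The helper names `read_worker_state` and `is_trust_stalled` are hypothetical.

```python
import json
from pathlib import Path
from typing import Optional

# Threshold taken from the roadmap text: seconds_since_update > 60.
STALL_THRESHOLD_SECONDS = 60


def read_worker_state(worker_cwd: str) -> Optional[dict]:
    """Read .claw/worker-state.json under the worker's CWD.

    Returns None when the file is missing or does not parse, so the
    caller can simply retry on the next poll tick.
    """
    path = Path(worker_cwd) / ".claw" / "worker-state.json"
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None


def is_trust_stalled(state: dict, threshold: float = STALL_THRESHOLD_SECONDS) -> bool:
    """Stall signal from step 2: trust_required status with a stale update.

    Field names are assumed to match the ones named in the roadmap.
    """
    return (
        state.get("status") == "trust_required"
        and state.get("seconds_since_update", 0) > threshold
    )
```

A caller would invoke `read_worker_state` on its chosen interval (the roadmap suggests something like 5s) and, when `is_trust_stalled` returns true, call the `WorkerResolveTrust` or `WorkerRestart` tool. How those tools are invoked is outside what this diff specifies, so it is left out of the sketch.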