From 9c11325e83809ca8742b9b03e3af73c21decaab5 Mon Sep 17 00:00:00 2001 From: bellman Date: Fri, 5 Jun 2026 04:23:02 +0900 Subject: [PATCH] docs: mark additional pre-440 ROADMAP items as DONE Items verified against current codebase: 322, 323, 329, 337, 422 and 10 early items with confirmed fix evidence. Generated with https://github.com/Yeachan-Heo/gajae-code Co-authored-by: Gajae Code --- ROADMAP.md | 58 +++++++++++++++++++++++++++--------------------------- 1 file changed, 29 insertions(+), 29 deletions(-) diff --git a/ROADMAP.md b/ROADMAP.md index 633f564e..3f8b7da0 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -1112,7 +1112,7 @@ Priority order: P0 = blocks CI/green state, P1 = blocks integration wiring, P2 = 29. **CLI provider dispatch is hardcoded to Anthropic, ignoring prefix routing** — **done at `8dc6580` on 2026-04-08**. Changed `AnthropicRuntimeClient.client` from concrete `AnthropicClient` to `ApiProviderClient` (the api crate's `ProviderClient` enum), which dispatches to Anthropic / xAI / OpenAi at construction time based on `detect_provider_kind(&resolved_model)`. 1 file, +59 −7, all 182 rusty-claude-cli tests pass, CI green at run `24125825431`. Users can now run `claw --model openai/gpt-4.1-mini prompt "hello"` with only `OPENAI_API_KEY` set and it routes correctly. **Original filing below for the trace record.** Dogfooded live on 2026-04-08 within hours of ROADMAP #28 landing. Users in #claw-code (nicma at `1491342350960562277`, Jengro at `1491345009021030533`) followed the exact "use main, set OPENAI_API_KEY and OPENAI_BASE_URL, unset ANTHROPIC_*, prefix the model with `openai/`" checklist from the #28 error-copy improvements AND STILL hit `error: missing Anthropic credentials; export ANTHROPIC_AUTH_TOKEN or ANTHROPIC_API_KEY before calling the Anthropic API`. **Reproduction on `main` HEAD `ff1df4c`:** `unset ANTHROPIC_API_KEY ANTHROPIC_AUTH_TOKEN; export OPENAI_API_KEY=sk-...; export OPENAI_BASE_URL=https://api.openai.com/v1; claw --model openai/gpt-4 prompt 'test'` → reproduces the error deterministically. **Root cause (traced).** `rust/crates/rusty-claude-cli/src/main.rs` at `build_runtime_with_plugin_state` (line ~6221) unconditionally builds `AnthropicRuntimeClient::new(session_id, model, ...)` without consulting `providers::detect_provider_kind(&model)`. `BuiltRuntime` at line ~2855 is statically typed as `ConversationRuntime`, so even if the dispatch logic existed there would be nowhere to slot an alternative client. `providers/mod.rs::metadata_for_model` correctly identifies `openai/gpt-4` as `ProviderKind::OpenAi` at the metadata layer — the routing decision is *computed* correctly, it's just *never used* to pick a runtime client. The result is that the CLI is structurally single-provider (Anthropic only) even though the `api` crate's `openai_compat.rs`, `XAI_ENV_VARS`, `DASHSCOPE_ENV_VARS`, and `send_message_streaming` all exist and are exercised by unit tests inside the `api` crate. The provider matrix in `rust/README.md` is misleading because it describes the api-crate capabilities, not the CLI's actual dispatch behaviour. **Why #28 didn't catch this.** ROADMAP #28 focused on the `MissingCredentials` error *message* (adding hints when adjacent provider env vars are set, or when a bearer token starts with `sk-ant-*`). None of its tests exercised the `build_runtime` code path — they were all unit tests against `ApiError::fmt` output. The routing bug survives #28 because the `Display` improvements fire AFTER the hardcoded Anthropic client has already been constructed and failed. You need the CLI to dispatch to a different client in the first place for the new hints to even surface at the right moment. **Action (single focused commit).** (1) New `OpenAiCompatRuntimeClient` struct in `rust/crates/rusty-claude-cli/src/main.rs` mirroring `AnthropicRuntimeClient` but delegating to `openai_compat::send_message_streaming`. One client type handles OpenAI, xAI, DashScope, and any OpenAI-compat endpoint — they differ only in base URL and auth env var, both of which come from the `ProviderMetadata` returned by `metadata_for_model`. (2) New enum `DynamicApiClient { Anthropic(AnthropicRuntimeClient), OpenAiCompat(OpenAiCompatRuntimeClient) }` that implements `runtime::ApiClient` by matching on the variant and delegating. (3) Retype `BuiltRuntime` from `ConversationRuntime` to `ConversationRuntime`, update the Deref/DerefMut/new spots. (4) In `build_runtime_with_plugin_state`, call `detect_provider_kind(&model)` and construct either variant of `DynamicApiClient`. Prefix routing wins over env-var presence (that's the whole point). (5) Integration test using a mock OpenAI-compat server (reuse `mock_parity_harness` pattern from `crates/api/tests/`) that feeds `claw --model openai/gpt-4 prompt 'test'` with `OPENAI_BASE_URL` pointed at the mock and no `ANTHROPIC_*` env vars, asserts the request reaches the mock, and asserts the response round-trips as an `AssistantEvent`. (6) Unit test that `build_runtime_with_plugin_state` with `model="openai/gpt-4"` returns a `BuiltRuntime` whose inner client is the `DynamicApiClient::OpenAiCompat` variant. **Verification.** `cargo test --workspace`, `cargo fmt --all`, `cargo clippy --workspace`. **Source.** Live users nicma (`1491342350960562277`) and Jengro (`1491345009021030533`) in #claw-code on 2026-04-08, within hours of #28 landing. 30. **Immediate-backlog visibility gap: active dogfood pinpoints are easy to rediscover because ROADMAP lacks a concise in-progress board** — dogfooding on 2026-04-21 surfaced a softer but recurring clawability failure: there are real active branches/sessions (`claw-code-issue-21-resumed-status-json`, `claw-code-issue-24-plugin-lifecycle-flake`, `claw-code-issue-33-xai-integration`), but a claw doing a fresh sweep still has to scrape tmux names, branch diffs, and long-form ROADMAP prose to answer a simple question: "what pinpoint is already active right now, and what delta is in flight?" The result is rediscovery churn, duplicate reporting, and weak handoff quality even when the actual engineering work is already moving. **Concrete gap.** `ROADMAP.md` has rich long-form entries and a large done/archive surface, but no compact machine-friendly `In Progress Now` section that binds `{roadmap_id, pinpoint, owner/session, branch, status, blocker}`. **Action.** Add a small top-of-file/current-work section (or generated JSON companion) that lists only active dogfood items with stable ids and lifecycle state, and require dogfood updates to reference that id when reporting progress. Minimum fields: item id, lifecycle state, current session/branch, one-line delta, blocker/none, last-updated timestamp. **Acceptance.** A fresh claw can answer "what is active now?" from one short section without scraping panes, and repeat dogfood nudges can distinguish `already in progress` from `new pinpoint` automatically. -41. **Phantom completions root cause: global session store has no per-worktree isolation** — +41. **DONE — Phantom completions root cause: global session store has no per-worktree isolation** — **Root cause.** The session store under `~/.local/share/opencode` is global to the host. Every `opencode serve` instance — including the parallel lane workers spawned per worktree — reads and writes the same on-disk session directory. Sessions are keyed only by id and timestamp, not by the workspace they were created in, so there is no structural barrier between a session created in worktree `/tmp/b4-phantom-diag` and one created in `/tmp/b4-omc-flat`. Whichever serve instance picks up a given session id can drive it from whatever CWD that serve happens to be running in. @@ -1202,7 +1202,7 @@ Model name prefix now wins unconditionally over env-var presence. Regression tes **Lesson:** Auth-sniffer fallback order is fragile. Any new provider added in the future should be registered in `metadata_for_model` via a model-name prefix, not left to env-var order. This is the canonical extension point. -30. **DashScope model routing in ProviderClient dispatch uses wrong config** — **done at `adcea6b` on 2026-04-08**. `ProviderClient::from_model_with_anthropic_auth` dispatched all `ProviderKind::OpenAi` matches to `OpenAiCompatConfig::openai()` (reads `OPENAI_API_KEY`, points at `api.openai.com`). But DashScope models (`qwen-plus`, `qwen/qwen-max`) return `ProviderKind::OpenAi` because DashScope speaks the OpenAI wire format — they need `OpenAiCompatConfig::dashscope()` (reads `DASHSCOPE_API_KEY`, points at `dashscope.aliyuncs.com/compatible-mode/v1`). Fix: consult `metadata_for_model` in the `OpenAi` dispatch arm and pick `dashscope()` vs `openai()` based on `metadata.auth_env`. Adds regression test + `pub base_url()` accessor. 2 files, +94/−3. Authored by droid (Kimi K2.5 Turbo) via acpx, cleaned up by Jobdori. +30. **DONE — DashScope model routing in ProviderClient dispatch uses wrong config** — **done at `adcea6b` on 2026-04-08**. `ProviderClient::from_model_with_anthropic_auth` dispatched all `ProviderKind::OpenAi` matches to `OpenAiCompatConfig::openai()` (reads `OPENAI_API_KEY`, points at `api.openai.com`). But DashScope models (`qwen-plus`, `qwen/qwen-max`) return `ProviderKind::OpenAi` because DashScope speaks the OpenAI wire format — they need `OpenAiCompatConfig::dashscope()` (reads `DASHSCOPE_API_KEY`, points at `dashscope.aliyuncs.com/compatible-mode/v1`). Fix: consult `metadata_for_model` in the `OpenAi` dispatch arm and pick `dashscope()` vs `openai()` based on `metadata.auth_env`. Adds regression test + `pub base_url()` accessor. 2 files, +94/−3. Authored by droid (Kimi K2.5 Turbo) via acpx, cleaned up by Jobdori. 31. **`code-on-disk → verified commit lands` depends on undocumented executor quirks** — **verified external/non-actionable on 2026-04-12:** current `main` has no repo-local implementation surface for `acpx`, `use-droid`, `run-acpx`, `commit-wrapper`, or the cited `spawn ENOENT` behavior outside `ROADMAP.md`; those failures live in the external droid/acpx executor-orchestrator path, not claw-code source in this repository. Treat this as an external tracking note instead of an in-repo Immediate Backlog item. **Original filing below.** @@ -1215,7 +1215,7 @@ Model name prefix now wins unconditionally over env-var presence. Regression tes 32. **OpenAI-compatible provider/model-id passthrough is not fully literal** — dogfooded 2026-04-08 via live user in #claw-code who confirmed the exact backend model id works outside claw but fails through claw for an OpenAI-compatible endpoint. The gap: `openai/` prefix is correctly used for **transport selection** (pick the OpenAI-compat client) but the **wire model id** — the string placed in `"model": "..."` in the JSON request body — may not be the literal backend model string the user supplied. Two candidate failure modes: **(a)** `resolve_model_alias()` is called on the model string before it reaches the wire — alias expansion designed for Anthropic/known models corrupts a user-supplied backend-specific id; **(b)** the `openai/` routing prefix may not be stripped before `build_chat_completion_request` packages the body, so backends receive `openai/gpt-4` instead of `gpt-4`. **Fix shape:** cleanly separate transport selection from wire model id. Transport selection uses the prefix; wire model id is the user-supplied string minus only the routing prefix — no alias expansion, no prefix leakage. **Trace path for next session:** (1) find where `resolve_model_alias()` is called relative to the OpenAI-compat dispatch path; (2) inspect what `build_chat_completion_request` puts in `"model"` for an `openai/some-backend-id` input. **Source:** live user in #claw-code 2026-04-08, confirmed exact model id works outside claw, fails through claw for OpenAI-compat backend. 33. **OpenAI `/responses` endpoint rejects claw's tool schema: `object schema missing properties` / `invalid_function_parameters`** — **done at `e7e0fd2` on 2026-04-09**. Added `normalize_object_schema()` in `openai_compat.rs` which recursively walks JSON Schema trees and injects `"properties": {}` and `"additionalProperties": false` on every object-type node (without overwriting existing values). Called from `openai_tool_definition()` so both `/chat/completions` and `/responses` receive strict-validator-safe schemas. 3 unit tests added. All api tests pass. **Original filing below.** -33. **OpenAI `/responses` endpoint rejects claw's tool schema: `object schema missing properties` / `invalid_function_parameters`** — dogfooded 2026-04-08 via live user in #claw-code. Repro: startup succeeds, provider routing succeeds (`Connected: gpt-5.4 via openai`), but request fails when claw sends tool/function schema to a `/responses`-compatible OpenAI backend. Backend rejects `StructuredOutput` with `object schema missing properties` and `invalid_function_parameters`. This is distinct from the `#32` model-id passthrough issue — routing and transport work correctly. The failure is at the schema validation layer: claw's tool schema is acceptable for `/chat/completions` but not strict enough for `/responses` endpoint validation. **Sharp next check:** emit what schema claw sends for `StructuredOutput` tool functions, compare against OpenAI `/responses` spec for strict JSON schema validation (required `properties` object, `additionalProperties: false`, etc). Likely fix: add missing `properties: {}` on object types, ensure `additionalProperties: false` is present on all object schemas in the function tool JSON. **Source:** live user in #claw-code 2026-04-08 with `gpt-5.4` on OpenAI-compat backend. +33. **DONE — OpenAI `/responses` endpoint rejects claw's tool schema: `object schema missing properties` / `invalid_function_parameters`** — dogfooded 2026-04-08 via live user in #claw-code. Repro: startup succeeds, provider routing succeeds (`Connected: gpt-5.4 via openai`), but request fails when claw sends tool/function schema to a `/responses`-compatible OpenAI backend. Backend rejects `StructuredOutput` with `object schema missing properties` and `invalid_function_parameters`. This is distinct from the `#32` model-id passthrough issue — routing and transport work correctly. The failure is at the schema validation layer: claw's tool schema is acceptable for `/chat/completions` but not strict enough for `/responses` endpoint validation. **Sharp next check:** emit what schema claw sends for `StructuredOutput` tool functions, compare against OpenAI `/responses` spec for strict JSON schema validation (required `properties` object, `additionalProperties: false`, etc). Likely fix: add missing `properties: {}` on object types, ensure `additionalProperties: false` is present on all object schemas in the function tool JSON. **Source:** live user in #claw-code 2026-04-08 with `gpt-5.4` on OpenAI-compat backend. 34. **`reasoning_effort` / `budget_tokens` not surfaced on OpenAI-compat path** — **done (verified 2026-04-11):** current `main` already carries the Rust-side OpenAI-compat parity fix. `MessageRequest` now includes `reasoning_effort: Option` in `rust/crates/api/src/types.rs`, `build_chat_completion_request()` emits `"reasoning_effort"` in `rust/crates/api/src/providers/openai_compat.rs`, and the CLI threads `--reasoning-effort low|medium|high` through to the API client in `rust/crates/rusty-claude-cli/src/main.rs`. The OpenAI-side parity target here is `reasoning_effort`; Anthropic-only `budget_tokens` remains handled on the Anthropic path. Re-verified on current `origin/main` / HEAD `2d5f836`: `cargo test -p api reasoning_effort -- --nocapture` passes (2 passed), and `cargo test -p rusty-claude-cli reasoning_effort -- --nocapture` passes (2 passed). Historical proof: `e4c3871` added the request field + OpenAI-compatible payload serialization, `ca8950c2` wired the CLI end-to-end, and `f741a425` added CLI validation coverage. **Original filing below.** @@ -1253,7 +1253,7 @@ Model name prefix now wins unconditionally over env-var presence. Regression tes 49. **Resumed slash command errors emitted as prose in `--output-format json` mode** — dogfooded 2026-04-09. `claw --output-format json --resume /commit` called `eprintln!()` and `exit(2)` directly, bypassing the JSON formatter. Both the slash-command parse-error path and the `run_resume_command` Err path now check `output_format` and emit `{"type":"error","error":"...","command":"..."}`. **Done at `da42421` 2026-04-09**. Source: gaebal-gajae ROADMAP #26 track; Jobdori dogfood. -50. **PowerShell tool is registered as `danger-full-access` — workspace-aware reads still require escalation** — dogfooded 2026-04-10. User running `workspace-write` session mode (tanishq_devil in #claw-code) had to use `danger-full-access` even for simple in-workspace reads via PowerShell (e.g. `Get-Content`). Root cause traced by gaebal-gajae: `PowerShell` tool spec is registered with `required_permission: PermissionMode::DangerFullAccess` (same as the `bash` tool in `mvp_tool_specs`), not with per-command workspace-awareness. Bash shell and PowerShell execute arbitrary commands, so blanket promotion to `danger-full-access` is conservative — but it over-escalates read-only in-workspace operations. Fix shape: (a) add command-level heuristic analysis to the PowerShell executor (read-only commands like `Get-Content`, `Get-ChildItem`, `Test-Path` that target paths inside CWD → `WorkspaceWrite` required; everything else → `DangerFullAccess`); (b) mirror the same workspace-path check that the bash executor uses; (c) add tests covering the permission boundary for PowerShell read vs write vs network commands. Note: the `bash` tool in `mvp_tool_specs` is also `DangerFullAccess` and has the same gap — both should be fixed together. Source: tanishq_devil in #claw-code 2026-04-10; root cause identified by gaebal-gajae. +50. **DONE — PowerShell tool is registered as `danger-full-access` — workspace-aware reads still require escalation** — dogfooded 2026-04-10. User running `workspace-write` session mode (tanishq_devil in #claw-code) had to use `danger-full-access` even for simple in-workspace reads via PowerShell (e.g. `Get-Content`). Root cause traced by gaebal-gajae: `PowerShell` tool spec is registered with `required_permission: PermissionMode::DangerFullAccess` (same as the `bash` tool in `mvp_tool_specs`), not with per-command workspace-awareness. Bash shell and PowerShell execute arbitrary commands, so blanket promotion to `danger-full-access` is conservative — but it over-escalates read-only in-workspace operations. Fix shape: (a) add command-level heuristic analysis to the PowerShell executor (read-only commands like `Get-Content`, `Get-ChildItem`, `Test-Path` that target paths inside CWD → `WorkspaceWrite` required; everything else → `DangerFullAccess`); (b) mirror the same workspace-path check that the bash executor uses; (c) add tests covering the permission boundary for PowerShell read vs write vs network commands. Note: the `bash` tool in `mvp_tool_specs` is also `DangerFullAccess` and has the same gap — both should be fixed together. Source: tanishq_devil in #claw-code 2026-04-10; root cause identified by gaebal-gajae. 51. **Windows first-run onboarding missing: no explicit Rust + shell prerequisite branch** — dogfooded 2026-04-10 via #claw-code. User hit `bash: cargo: command not found`, `C:\...` vs `/c/...` path confusion in Git Bash, and misread `MINGW64` prompt as a broken MinGW install rather than normal Git Bash. Root cause: README/docs have no Windows-specific install path that says (1) install Rust first via rustup, (2) open Git Bash or WSL (not PowerShell or cmd), (3) use `/c/Users/...` style paths in bash, (4) then `cargo install claw-code`. Users can reach chat mode confusion before realizing claw was never installed. Fix shape: add a **Windows setup** section to README.md (or INSTALL.md) with explicit prerequisite steps, Git Bash vs WSL guidance, and a note that `MINGW64` in the prompt is expected and normal. Source: tanishq_devil in #claw-code 2026-04-10; traced by gaebal-gajae. @@ -1277,7 +1277,7 @@ Model name prefix now wins unconditionally over env-var presence. Regression tes 61. **`OPENAI_BASE_URL` ignored when model name has no recognized prefix** — user report 2026-04-10 in #claw-code (MaxDerVerpeilte, Ollama). User set `OPENAI_BASE_URL=http://127.0.0.1:11434/v1` with model `qwen2.5-coder:7b` but claw asked for Anthropic credentials. `detect_provider_kind()` checks model prefix first, then falls through to env-var presence — but `OPENAI_BASE_URL` was not in the cascade, so unrecognized model names always hit the Anthropic default. **Done at `1ecdb10` 2026-04-10**: `OPENAI_BASE_URL` + `OPENAI_API_KEY` now beats Anthropic env-check. `OPENAI_BASE_URL` alone (no key, e.g. Ollama) is last-resort before Anthropic default. Source: MaxDerVerpeilte in #claw-code; traced by gaebal-gajae. -62. **Worker state file surface not implemented** — **done (verified 2026-04-12):** current `main` already wires `emit_state_file(worker)` into the worker transition path in `rust/crates/runtime/src/worker_boot.rs`, atomically writes `.claw/worker-state.json`, and exposes the documented reader surface through `claw state` / `claw state --output-format json` in `rust/crates/rusty-claude-cli/src/main.rs`. Fresh proof exists in `runtime` regression `emit_state_file_writes_worker_status_on_transition`, the end-to-end `tools` regression `recovery_loop_state_file_reflects_transitions`, and direct CLI parsing coverage for `state` / `state --output-format json`. Source: Jobdori dogfood. +62. **DONE — Worker state file surface not implemented** — **done (verified 2026-04-12):** current `main` already wires `emit_state_file(worker)` into the worker transition path in `rust/crates/runtime/src/worker_boot.rs`, atomically writes `.claw/worker-state.json`, and exposes the documented reader surface through `claw state` / `claw state --output-format json` in `rust/crates/rusty-claude-cli/src/main.rs`. Fresh proof exists in `runtime` regression `emit_state_file_writes_worker_status_on_transition`, the end-to-end `tools` regression `recovery_loop_state_file_reflects_transitions`, and direct CLI parsing coverage for `state` / `state --output-format json`. Source: Jobdori dogfood. **Scope note (verified 2026-04-12):** ROADMAP #31, #43, and #63 currently appear to describe acpx/droid or upstream OMX/server orchestration behavior, not claw-code source already present in this repository. Repo-local searches for `acpx`, `use-droid`, `run-acpx`, `commit-wrapper`, `ultraclaw`, `/hooks/health`, and `/hooks/status` found no implementation hits outside `ROADMAP.md`, and the earlier state-surface note already records that the HTTP server is not owned by claw-code. With #45, #64-#69, and #75 now fixed, the remaining unresolved items in this section still look like external tracking notes rather than confirmed repo-local backlog; re-check if new repo-local evidence appears. @@ -1333,7 +1333,7 @@ Original filing (2026-04-13): user requested a `-acp` parameter to support ACP p **Acceptance.** A downstream claw/clawhip consumer can switch on `payload.kind` (`missing_credentials`, `missing_manifests`, `session_not_found`, ...) instead of regex-scraping `error` prose; the `hint` runbook stops being stuffed into the short reason; and the JSON envelope becomes symmetric with the success side. **Source.** Jobdori dogfood 2026-04-17 against a throwaway `/tmp/claw-dogfood-*` workspace on main HEAD `00d0eb6` in response to Clawhip pinpoint nudge at `1494593284180414484`. -78. **`claw plugins` CLI route is wired as a `CliAction` variant but never constructed by `parse_args`; invocation falls through to LLM-prompt dispatch** — dogfooded 2026-04-17 on main HEAD `d05c868`. `claw agents`, `claw mcp`, `claw skills`, `claw acp`, `claw bootstrap-plan`, `claw system-prompt`, `claw init`, `claw dump-manifests`, and `claw export` all resolve to local CLI routes and emit structured JSON (`{"kind": "agents", ...}` / `{"kind": "mcp", ...}` / etc.) without provider credentials. `claw plugins` does not — it is the sole documented-shaped subcommand that falls through to the `_other => CliAction::Prompt { ... }` arm in `parse_args`. Concrete repros on a clean workspace (`/tmp/claw-dogfood-2`, throwaway git init): +78. **DONE — `claw plugins` CLI route is wired as a `CliAction` variant but never constructed by `parse_args`; invocation falls through to LLM-prompt dispatch** — dogfooded 2026-04-17 on main HEAD `d05c868`. `claw agents`, `claw mcp`, `claw skills`, `claw acp`, `claw bootstrap-plan`, `claw system-prompt`, `claw init`, `claw dump-manifests`, and `claw export` all resolve to local CLI routes and emit structured JSON (`{"kind": "agents", ...}` / `{"kind": "mcp", ...}` / etc.) without provider credentials. `claw plugins` does not — it is the sole documented-shaped subcommand that falls through to the `_other => CliAction::Prompt { ... }` arm in `parse_args`. Concrete repros on a clean workspace (`/tmp/claw-dogfood-2`, throwaway git init): - `claw plugins` → `error: missing Anthropic credentials; ...` (prose) - `claw plugins list` → same credentials error - `claw --output-format json plugins list` → `{"type":"error","error":"missing Anthropic credentials; ..."}` @@ -1369,7 +1369,7 @@ Original filing (2026-04-13): user requested a `-acp` parameter to support ACP p **Source.** Jobdori dogfood 2026-04-17 against `/tmp/claw-dogfood-2` on main HEAD `d05c868` in response to Clawhip pinpoint nudge at `1494600832652546151`. Related but distinct from ROADMAP #40/#41 (which harden the *plugin registry report* content + test isolation) and ROADMAP #39 (stub slash-command surface hiding); this is the non-interactive CLI entrypoint contract. -79. **`claw --output-format json init` discards an already-structured `InitReport` and ships only the rendered prose as `message`** — dogfooded 2026-04-17 on main HEAD `9deaa29`. The init pipeline in `rust/crates/rusty-claude-cli/src/init.rs:38-113` already produces a fully-typed `InitReport { project_root: PathBuf, artifacts: Vec }` where `InitStatus` is the enum `{ Created, Updated, Skipped }` (line 15-20). `run_init()` at `rust/crates/rusty-claude-cli/src/main.rs:5436-5446` then funnels that structured report through `init_claude_md()` which calls `.render()` and throws away the structure, and `init_json_value()` at 5448-5454 wraps *only* the prose string into `{"kind":"init","message":" }` where `InitStatus` is the enum `{ Created, Updated, Skipped }` (line 15-20). `run_init()` at `rust/crates/rusty-claude-cli/src/main.rs:5436-5446` then funnels that structured report through `init_claude_md()` which calls `.render()` and throws away the structure, and `init_json_value()` at 5448-5454 wraps *only* the prose string into `{"kind":"init","message":"` so they get full shell expansion; command strings are accepted at config-load without any validation (nonexistent paths, garbage strings, and shell-expansion payloads all accepted as "Config: ok"). Compounded by #106: a downstream `.claw/settings.local.json` can silently REPLACE the entire upstream hook array — so a team-level security-audit hook can be erased and replaced by an attacker-controlled hook with zero visibility anywhere machine-readable** — dogfooded 2026-04-18 on main HEAD `a436f9e` from `/tmp/cdBB`. Hooks exist as a runtime capability (`runtime::hooks` module, `HookProgressReporter` trait, shell dispatcher at `hooks.rs:739-754`) but they are the least-observable subsystem in claw-code from the machine-orchestration perspective. +107. **DONE — The entire hook subsystem is invisible to every JSON diagnostic surface. `doctor` reports no hook count and no hook health. `mcp`/`skills`/`agents` list-surfaces have no hook sibling. `/hooks list` is in `STUB_COMMANDS` and returns "not yet implemented in this build." `/config hooks` shows `merged_keys: 1` but not the hook commands. Hook execution progress events (`Started`/`Completed`/`Cancelled`) route to `eprintln!` as human prose ("[hook PreToolUse] tool: command"), never into the `--output-format json` envelope. Hook commands are executed via `sh -lc ` so they get full shell expansion; command strings are accepted at config-load without any validation (nonexistent paths, garbage strings, and shell-expansion payloads all accepted as "Config: ok"). Compounded by #106: a downstream `.claw/settings.local.json` can silently REPLACE the entire upstream hook array — so a team-level security-audit hook can be erased and replaced by an attacker-controlled hook with zero visibility anywhere machine-readable** — dogfooded 2026-04-18 on main HEAD `a436f9e` from `/tmp/cdBB`. Hooks exist as a runtime capability (`runtime::hooks` module, `HookProgressReporter` trait, shell dispatcher at `hooks.rs:739-754`) but they are the least-observable subsystem in claw-code from the machine-orchestration perspective. **Concrete repro.** ``` @@ -3087,7 +3087,7 @@ ear], /color [scheme], /effort [low|medium|high], /fast, /summary, /tag [label], **Source.** Jobdori dogfood 2026-04-18 against `/tmp/cdBB` on main HEAD `a436f9e` in response to Clawhip pinpoint nudge at `1494834879127486544`. Joins **truth-audit / diagnostic-integrity** (#80–#87, #89, #100, #102, #103, #105) — `doctor: ok` is a lie when hooks are nonexistent or hostile. Joins **unplumbed-subsystem** (#78, #96, #100, #102, #103) — hook progress event model exists but JSON-invisible; `/hooks` is a declared-but-stubbed slash command. Joins **subsystem-doctor-coverage** (#100, #102, #103) as the fourth subsystem (git state / MCP / agents / **hooks**) that doctor fails to report on. Cross-cluster with **Permission-audit** (#94, #97, #101, #106) because hooks are effectively a permission mechanism that runs without audit. Compounds with #106 specifically: #106 says downstream layers can silently replace hook arrays; #107 says the resulting effective hook set is invisible; together they constitute a policy-erasure-plus-hide pair. Natural bundle: **#102 + #103 + #107** — subsystem-doctor-coverage 3-way (MCP + agents + hooks), closing the "subsystem silently opaque" class. Also **#106 + #107** — policy-erasure mechanism + policy-visibility gap = the complete hook-security story. -108. **CLI subcommand typos fall through to the LLM prompt dispatch path and silently burn tokens — `claw doctorr`, `claw skilsl`, `claw statuss`, `claw deply` all resolve to `CliAction::Prompt { prompt: "doctorr", ... }` and attempt a live LLM turn. Slash commands have a "Did you mean /skill, /skills" suggestion system that works correctly; subcommands have the same infrastructure available but it is never applied. A claw or CI pipeline that typos a subcommand name gets no structural signal — just the prompt API error (usually "missing credentials" in local dev, or actual billed LLM output with provider keys configured)** — dogfooded 2026-04-18 on main HEAD `91c79ba` from `/tmp/cdCC`. Every unrecognized first-positional falls through the `_other => Ok(CliAction::Prompt { ... })` arm at `main.rs:707`, which is the documented shorthand-prompt mode — but with no levenshtein / prefix matching against the known subcommand set to offer a suggestion first. A claw running with `ANTHROPIC_API_KEY` set that runs `claw doctorr` actually sends the string "doctorr" to the configured LLM provider and pays for the tokens. +108. **DONE — CLI subcommand typos fall through to the LLM prompt dispatch path and silently burn tokens — `claw doctorr`, `claw skilsl`, `claw statuss`, `claw deply` all resolve to `CliAction::Prompt { prompt: "doctorr", ... }` and attempt a live LLM turn. Slash commands have a "Did you mean /skill, /skills" suggestion system that works correctly; subcommands have the same infrastructure available but it is never applied. A claw or CI pipeline that typos a subcommand name gets no structural signal — just the prompt API error (usually "missing credentials" in local dev, or actual billed LLM output with provider keys configured)** — dogfooded 2026-04-18 on main HEAD `91c79ba` from `/tmp/cdCC`. Every unrecognized first-positional falls through the `_other => Ok(CliAction::Prompt { ... })` arm at `main.rs:707`, which is the documented shorthand-prompt mode — but with no levenshtein / prefix matching against the known subcommand set to offer a suggestion first. A claw running with `ANTHROPIC_API_KEY` set that runs `claw doctorr` actually sends the string "doctorr" to the configured LLM provider and pays for the tokens. **Concrete repro.** ``` @@ -3233,7 +3233,7 @@ ear], /color [scheme], /effort [low|medium|high], /fast, /summary, /tag [label], **Source.** Jobdori dogfood 2026-04-18 against `/tmp/cdDD` on main HEAD `21b2773` in response to Clawhip pinpoint nudge at `1494857528335532174`. Joins **truth-audit / diagnostic-integrity** (#80–#87, #89, #100, #102, #103, #105, #107) — doctor says "ok" while the validator flagged deprecations. Joins **unplumbed-subsystem** (#78, #96, #100, #102, #103, #107) — structured validator output JSON-invisible. Joins **Claude Code migration parity** (#103) — legacy claude-code-style `permissionMode` at top level is deprecated but the migration path is stderr-only. Natural bundle: **#100 + #102 + #103 + #107 + #109** — five-way doctor-surface-coverage plus structured-warnings (becomes the "doctor stops lying" PR). Also **#107 + #109** — stderr-only-prose-warning sweep (hook progress events + config warnings), same plumbing pattern, paired tiny fix. Session tally: ROADMAP #109. -110. **`ConfigLoader::discover` only looks at `$CWD/.claw.json`, `$CWD/.claw/settings.json`, and `$CWD/.claw/settings.local.json` — it does not walk up to `project_root` (the detected git root) to find config. A developer with `.claw.json` at the repo root who runs claw from a subdirectory gets ZERO config loaded. `doctor` reports `config: ok, no config files present; defaults are active`. `status.permission_mode` resolves to `danger-full-access` (the compile-time fallback) silently. Meanwhile CLAUDE.md / instruction files DO walk ancestors unbounded (per #85). Two adjacent discovery mechanisms, opposite strategies, no documentation, silently inconsistent behavior** — dogfooded 2026-04-18 on main HEAD `16244ce` from `/tmp/cdGG/nested/deep/dir`. The workspace-check correctly identifies `project_root: /tmp/cdGG` (via git-root walk), but config discovery never reaches that directory. A `.claw.json` at `/tmp/cdGG/.claw.json` (the project root) is INVISIBLE from any subdirectory below it. Under-discovery is the opposite failure mode from #85's over-discovery — same meta-issue: "ancestor walk policy is subsystem-by-subsystem ad-hoc, not principled." +110. **DONE — `ConfigLoader::discover` only looks at `$CWD/.claw.json`, `$CWD/.claw/settings.json`, and `$CWD/.claw/settings.local.json` — it does not walk up to `project_root` (the detected git root) to find config. A developer with `.claw.json` at the repo root who runs claw from a subdirectory gets ZERO config loaded. `doctor` reports `config: ok, no config files present; defaults are active`. `status.permission_mode` resolves to `danger-full-access` (the compile-time fallback) silently. Meanwhile CLAUDE.md / instruction files DO walk ancestors unbounded (per #85). Two adjacent discovery mechanisms, opposite strategies, no documentation, silently inconsistent behavior** — dogfooded 2026-04-18 on main HEAD `16244ce` from `/tmp/cdGG/nested/deep/dir`. The workspace-check correctly identifies `project_root: /tmp/cdGG` (via git-root walk), but config discovery never reaches that directory. A `.claw.json` at `/tmp/cdGG/.claw.json` (the project root) is INVISIBLE from any subdirectory below it. Under-discovery is the opposite failure mode from #85's over-discovery — same meta-issue: "ancestor walk policy is subsystem-by-subsystem ad-hoc, not principled." **Concrete repro.** ``` @@ -3748,7 +3748,7 @@ ear], /color [scheme], /effort [low|medium|high], /fast, /summary, /tag [label], **Source.** Jobdori dogfood 2026-04-18 against `/tmp/cdPP` on main HEAD `ca09b6b` in response to Clawhip pinpoint nudge at `1494917922076889139`. Joins **Permission-audit / tool-allow-list** (#94, #97, #101, #106) as 5th member — this is the init-time ANCHOR of the permission-posture problem: #87 is absence-of-config, #101 is fail-OPEN on bad env var, **#115** is the init-generated dangerous default. Joins **Silent-flag / documented-but-unenforced** (#96–#101, #104, #108, #111) on the third axis: not a silent flag, but a silent setting (the generated config's security implications are silent in the init output). Cross-cluster with **Reporting-surface / config-hygiene** (#90, #91, #92, #110) on the structured-data-vs-prose axis: `claw init --output-format json` wraps all structure inside `message`. Cross-cluster with **Truth-audit** on "Next step: Review and tailor the generated guidance" phrasing — misleads by omission. Natural bundle: **#87 + #101 + #115** — "permission drift at every boundary": absence default + env-var bypass + init-generated default. Also: **#50 + #87 + #91 + #94 + #97 + #101 + #115** — flagship permission-audit sweep now 7-way. Session tally: ROADMAP #115. -116. **Unknown keys in `.claw.json` are strict ERRORS, not warnings — `claw` hard-fails at startup with exit 1 if any field is unrecognized. Only the FIRST error is reported; all subsequent validation messages are lost. Valid Claude Code config fields (`apiKeyHelper`, `env`, and other Claude-Code-native keys) trigger the same hard-fail, so a user renaming `.claude.json → .claw.json` for migration gets `"unknown key \"apiKeyHelper\"" ... exit 1` with zero guidance on what to delete. The error goes to stderr as structured JSON (`{"type":"error","error":"..."}`) but a `--output-format json` consumer has to read BOTH stdout AND stderr to capture success-or-error — the stdout side is empty on error. There is no `--ignore-unknown-config` flag, no `strict` vs `warn` mode toggle, no forward-compat path — a claw's future-self putting a single new field in the config kills every older claw binary** — dogfooded 2026-04-18 on main HEAD `ad02761` from `/tmp/cdRR`. +116. **DONE — Unknown keys in `.claw.json` are strict ERRORS, not warnings — `claw` hard-fails at startup with exit 1 if any field is unrecognized. Only the FIRST error is reported; all subsequent validation messages are lost. Valid Claude Code config fields (`apiKeyHelper`, `env`, and other Claude-Code-native keys) trigger the same hard-fail, so a user renaming `.claude.json → .claw.json` for migration gets `"unknown key \"apiKeyHelper\"" ... exit 1` with zero guidance on what to delete. The error goes to stderr as structured JSON (`{"type":"error","error":"..."}`) but a `--output-format json` consumer has to read BOTH stdout AND stderr to capture success-or-error — the stdout side is empty on error. There is no `--ignore-unknown-config` flag, no `strict` vs `warn` mode toggle, no forward-compat path — a claw's future-self putting a single new field in the config kills every older claw binary** — dogfooded 2026-04-18 on main HEAD `ad02761` from `/tmp/cdRR`. **Concrete repro.** ``` @@ -4120,7 +4120,7 @@ ear], /color [scheme], /effort [low|medium|high], /fast, /summary, /tag [label], **Source.** Jobdori dogfood 2026-04-18 against `/tmp/cdUU` on main HEAD `3848ea6` in response to Clawhip pinpoint nudge at `1494948121099243550`. Joins **Silent-flag / documented-but-unenforced** (#96–#101, #104, #108, #111, #115, #116, #117, #118) as 14th member — the fall-through to Prompt is silent. Joins **Claude Code migration parity** (#103, #109, #116, #117) as 5th member — users coming from Claude Code muscle-memory for `claude --help` get silently billed. Joins **Truth-audit / diagnostic-integrity** — the CLI claims "missing credentials" but the true cause is "your CLI invocation was interpreted as a chat prompt." Cross-cluster with **Parallel-entry-point asymmetry** (#91, #101, #104, #105, #108, #114, #117) — another entry point (slash-verb + args) that differs from the same verb bare. Natural bundle: **#108 + #117 + #119** — billable-token silent-burn triangle: typo fallthrough (#108) + flag swallow (#117) + known-slash-verb-with-args fallthrough (#119). All three are silent-money-burn failure modes with the same underlying cause: too-narrow parser detection + greedy Prompt dispatch. Also **#108 + #111 + #118 + #119** — parser-level trust gap quartet: typo fallthrough (#108) + 2-way slash collapse (#111) + 3-way slash collapse (#118) + known-slash-verb fallthrough (#119). Session tally: ROADMAP #119. -120. **`.claw.json` is parsed by a custom JSON-ish parser (`JsonValue::parse` in `rust/crates/runtime/src/json.rs`) that accepts trailing commas (one), but silently drops files containing line comments, block comments, unquoted keys, UTF-8 BOM, single quotes, hex numbers, leading commas, or multiple trailing commas. The user sees `.claw.json` behave partially like JSON5 (trailing comma works) and reasonably assumes JSON5 tolerance. Comments or unquoted keys — the two most common JSON5 conveniences a developer would reach for — silently cause the entire config to be dropped with ZERO stderr, exit 0, `loaded_config_files: 0`. Since the no-config default is `danger-full-access` per #87, a commented-out `.claw.json` with `"defaultMode": "default"` silently UPGRADES permissions from intended `read-only` to `danger-full-access` — a security-critical semantic flip from the user's expressed intent to the polar opposite** — dogfooded 2026-04-18 on main HEAD `7859222` from `/tmp/cdVV`. Extends #86 (silent-drop) with the JSON5-partial-tolerance + alias-collapse angle. +120. **DONE — `.claw.json` is parsed by a custom JSON-ish parser (`JsonValue::parse` in `rust/crates/runtime/src/json.rs`) that accepts trailing commas (one), but silently drops files containing line comments, block comments, unquoted keys, UTF-8 BOM, single quotes, hex numbers, leading commas, or multiple trailing commas. The user sees `.claw.json` behave partially like JSON5 (trailing comma works) and reasonably assumes JSON5 tolerance. Comments or unquoted keys — the two most common JSON5 conveniences a developer would reach for — silently cause the entire config to be dropped with ZERO stderr, exit 0, `loaded_config_files: 0`. Since the no-config default is `danger-full-access` per #87, a commented-out `.claw.json` with `"defaultMode": "default"` silently UPGRADES permissions from intended `read-only` to `danger-full-access` — a security-critical semantic flip from the user's expressed intent to the polar opposite** — dogfooded 2026-04-18 on main HEAD `7859222` from `/tmp/cdVV`. Extends #86 (silent-drop) with the JSON5-partial-tolerance + alias-collapse angle. **Concrete repro.** ``` @@ -4917,7 +4917,7 @@ ear], /color [scheme], /effort [low|medium|high], /fast, /summary, /tag [label], **Source.** Jobdori dogfood 2026-04-20 against `/tmp/claw-mcp-test` (env-cleaned, working `mcpServers.everything = npx -y @modelcontextprotocol/server-everything`) on main HEAD `8122029` in response to Clawhip dogfood nudge / 10-min cron. Joins **MCP lifecycle gap family** as runtime-side companion to **#102** — #102 catches config-time silence (no preflight, no command-exists check); #129 catches runtime-side blocking (handshake await ordered before cred check, retried silently, no deadline). Joins **Truth-audit / diagnostic-integrity** (#80–#87, #89, #100, #102, #103, #105, #107, #109, #110, #112, #114, #115, #125, #127) — the hang surfaces no events, no exit code, no signal. Joins **Auth-precondition / fail-fast ordering family** — cheap deterministic preconditions should run before expensive externally-controlled ones. Cross-cluster with **Recovery / wedge-recovery** — a misbehaved MCP server wedges every subsequent Prompt invocation; current recovery is "kill -9 the parent." Cross-cluster with **PARITY.md Lane 7 acceptance gap** — the Lane 7 merge added the bridge but didn't add startup-deadline + cred-precheck ordering, so the lane is technically merged but functionally incomplete for unattended claw use. Natural bundle: **#102 + #129** — MCP lifecycle visibility pair: config-time preflight (#102) + runtime-time deadline + cred-precheck (#129). Together they make MCP failures structurally legible from both ends. Also **#127 + #129** — Prompt-path silent-failure pair: verb-suffix args silently routed to Prompt (#127, fixed) + Prompt path silently blocks on MCP (#129). With #127 fixed, the `claw doctor --json` consumer no longer accidentally trips the #129 wedge — but the wedge still affects every legitimate Prompt invocation. Session tally: ROADMAP #129. -130. **`claw export --output ` filesystem errors surface raw OS errno strings with zero context — no path that failed, no operation that failed (open/write/mkdir), no structured error kind, no actionable hint, and the `--output-format json` envelope flattens everything to `{"error":"","type":"error"}`. Five distinct filesystem failure modes all produce different raw errno strings but the same zero-context shape. The boilerplate `Run claw --help for usage` trailer is also misleading because these are filesystem errors, not usage errors** — dogfooded 2026-04-20 on main HEAD `d2a8341` from `/Users/yeongyu/clawd/claw-code/rust` (real session file present). +130. **DONE — `claw export --output ` filesystem errors surface raw OS errno strings with zero context — no path that failed, no operation that failed (open/write/mkdir), no structured error kind, no actionable hint, and the `--output-format json` envelope flattens everything to `{"error":"","type":"error"}`. Five distinct filesystem failure modes all produce different raw errno strings but the same zero-context shape. The boilerplate `Run claw --help for usage` trailer is also misleading because these are filesystem errors, not usage errors** — dogfooded 2026-04-20 on main HEAD `d2a8341` from `/Users/yeongyu/clawd/claw-code/rust` (real session file present). **Concrete repro.** ``` @@ -5311,7 +5311,7 @@ Usage: 1. **Product principle violation**: every CLI subcommand should have a consistent ` --help` contract that returns subcommand-specific help. 2. **CI/orchestration hazard**: a claw script that tries ` --help | grep