diff --git a/ROADMAP.md b/ROADMAP.md index caa656df..e997f260 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -6491,7 +6491,7 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed) Same parser, same `--resume ` shape, **only `--version` short-circuits**. **Root cause (traced):** `parse_args` in `rust/crates/rusty-claude-cli/src/main.rs:630-654` has asymmetric `--help`/`--version` arms. `"--help" | "-h" if rest.is_empty() => { wants_help = true; index += 1; }` at line 632, plus a second `--help` arm at lines 636-650 gated by `if !rest.is_empty() && matches!(rest[0].as_str(), "prompt" | "commit" | "pr" | "issue")` — both guards exclude `rest[0] == "--resume"`. Meanwhile `"--version" | "-V" => { wants_version = true; index += 1; }` at lines 651-654 has **no guard at all** — version always captures regardless of `rest` state. The `--resume` arm at line 762 (`"--resume" if rest.is_empty() => { rest.push("--resume".to_string()); index += 1; }`) pushes `"--resume"` into `rest` and advances by ONE, so the parser continues. On the next iteration with `rest = ["--resume"]`: `--help` fails both guards → falls through to the catch-all `other => { rest.push(other.to_string()); index += 1; }` at lines 793-796 → `rest = ["--resume", "--help"]` → resume dispatch later treats `--help` as the session-id positional argument → `session not found: --help` → exit 1. `--version` instead matches its unguarded arm → `wants_version = true` → the final `if wants_version { return Ok(CliAction::Version { ... }); }` at lines 803-805 short-circuits before resume dispatch ever runs. **Why distinct from existing items:** #117 (lines 3911-3913 in ROADMAP) covers `claw -p "test" --help` → `-p` swallows everything via greedy `args[index + 1..].join(" ")` and a hardcoded `return` from inside `-p`'s arm; that's a `-p`-specific greedy-swallow bug. This pinpoint is the **opposite asymmetry**: `--resume` is *not* greedy — it correctly advances by one and lets the parser continue — but the `--help` arm's `rest.is_empty()` guard plus the 4-name allowlist (`prompt|commit|pr|issue`) excludes `--resume` from the allowlist while `--version` has no such guard. The result: under `--resume`, version still works but help is unreachable. #2186 (existing `--help` Resume-safe block) tracks the inverse — `claw --help` itself lying about which resume-safe commands work — not "you cannot get to `--help` from a `--resume` line at all." #21/#55/#113 cover REPL-only session verbs (`/session list`/`/session switch`/etc.), not the CLI-side `--help` accessibility on a `--resume` line. #117's fix is "make `-p` non-greedy"; this fix is "give `--help` the same unguarded short-circuit as `--version`, OR add `--resume` to the help allowlist." **Why it matters:** (a) **Help discoverability is broken for the most common entry point.** A user typing `claw --resume ` is exploring; the next muscle-memory step is `claw --resume --help` to see "what's the resume subcommand syntax?" Instead they get `session not found: --help` with no indication that `--help` is a flag, not a session id. The hint text even directs them to `/session list` which they have no way to invoke from the same `--help`-blocked CLI. (b) **CLI/version parity contract violated.** Every other `claw --version` and `claw --help` pair returns the corresponding help/version page. `--resume` uniquely breaks the `--help` half of that contract while preserving the `--version` half — there's no documentation anywhere that `--resume` swallows `--help`. (c) **The "session not found: --help" message is actively misleading.** It implies the user provided a malformed session id, not that they tried to read documentation. A claw orchestrator that parses this error envelope (`kind:"session_not_found"`, `error:"…session not found: --help…"`) will not realize the human typed `--help`. (d) **Sibling clawability gap.** The same arm-structure means `help`, `list`, `ls`, `show` are all silently absorbed as session-id literals. There is no CLI way to enumerate sessions — `/session list` (REPL-only, called out in the hint!) is the only path. A non-interactive claw cannot ask "what sessions exist?" without either parsing `.claw/sessions//` directly (bypasses claw bookkeeping) or spawning a TTY (#113 already covers the broader REPL-only session-verb gap, but this pinpoint shows the CLI-side hint message itself is recommending a command the CLI cannot run). **Required fix shape:** (a) **make `--help` short-circuit unconditionally**, matching `--version` — remove the `if rest.is_empty()` and the 4-name allowlist guards on the top-level `--help` arms in `parse_args`; let every `--help` anywhere in argv return `CliAction::Help { output_format }`. The reason for the original guards (per the comment at lines 638-647: "Subcommands that consume their own args (agents, mcp, plugins, skills) and local help-topic subcommands … must NOT be intercepted here — they handle --help in their own dispatch paths via parse_local_help_action()") is to let subcommand-local help win over global help. But `--resume` is **not** a subcommand-local-help path — it's a top-level flag whose `` consumes positional input, and its dispatch has no `parse_local_help_action()` analogue. Either add `--resume` to the allowlist `matches!(rest[0].as_str(), "prompt" | "commit" | "pr" | "issue" | "--resume")` so `--help` after `--resume` triggers top-level help, OR (simpler) drop the global `--help` guards entirely and let any `parse_local_help_action()` path do its own dispatch before `parse_args` runs. (b) **CLI-side session enumeration.** Add a `claw session list` / `claw sessions` top-level subcommand (or an explicit `claw --list-sessions` flag) that emits the same `{kind:"session_list", sessions:[…], active:}` JSON envelope as the REPL `/session list`, so the hint text in every session-not-found error has an actually-callable CLI parallel. Sibling of #113's broader session-verb gap but distinct: this is the minimum-viable list capability, not the full switch/fork/delete matrix. (c) **Better error wording when the "session id" looks like a flag.** When the resume-dispatch session-id resolver receives a string that starts with `-` (e.g. `--help`, `-h`, `--version`, `--output-format`), emit `kind:"flag_swallowed_as_session_id"` with `flag:"--help"`, `hint:"--help must appear before --resume or use claw --help on its own; --resume takes a session id literal, not a flag"`. Same for `latest`-adjacent typos: `list`, `ls`, `show`, `help` could carry `hint:"did you mean to call /session list? CLI equivalent is claw session list"`. (d) **Regression coverage.** Property test asserting `claw --resume --help`, `claw --resume -h`, `claw --resume --help --output-format json`, `claw --resume foo --help`, and `claw --help --resume foo` ALL produce `CliAction::Help { … }` exit 0; assert `claw --resume help`, `claw --resume list`, `claw --resume ls`, `claw --resume show` produce a structured `flag_swallowed_as_session_id` or `unsupported_session_alias` envelope with `kind` set, never the catch-all `session_not_found` bucket. **Acceptance check (one-liner):** `claw --resume --help >/dev/null 2>&1; test $? -eq 0` should pass (and stdout should contain the resume help text). Source: gaebal-gajae dogfood follow-up for the 2026-05-24 08:00 Clawhip pinpoint nudge at message `1508016732986408983`. -458. **There is no portable success-detection field across `claw --output-format json` envelopes — only `kind` is universal; `status`, `action`, `summary`, `message`, and `report` are present on different subsets, so a claw that writes `if response["status"] == "ok"` silently breaks on 7 of the 9 standard subcommands** — dogfooded 2026-05-24 for the 09:00 Clawhip pinpoint nudge at message `1508031832669814834`, reproduced on local `./rust/target/debug/claw` `git_sha 003b739d` (origin/main `f8e1bb72`). Catalog of top-level fields in successful `--output-format json` envelopes for the nine standard subcommands (clean isolated env, no `.claw.json`, fresh git-init workspace): +458. **DONE — `status` field now universal across all JSON envelopes** — verified 2026-06-04: all 9 standard subcommands (`status`, `mcp`, `skills`, `agents`, `doctor`, `sandbox`, `init`, `system-prompt`, `version`) return `status` field. `action` field also universal. | Subcommand | `kind` | `status` | `action` | `summary` | `message` | `report` | |---|---|---|---|---|---|---| @@ -6507,7 +6507,7 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed) **Only `kind` is universal.** Every other top-level discriminator is present on some envelopes and missing on others. A claw orchestrator that does `if json.status == "ok"` works for exactly 2 of 9 subcommands (`status`, `mcp`) and **silently returns `false`/`None` for the other 7 because the field doesn't exist**. A claw that does `if json.action == "list"` works for 3 of 9 (`mcp`, `skills`, `agents`) and breaks for the rest. There is **no single field a claw can read to determine "did this subcommand succeed?"** other than parsing stdout-vs-stderr routing (broken by #447/#450/#340/#341) or checking process exit code (broken by #435/#444). **Why distinct from existing items:** #90/#91/#92/#110/#115/#116/#130 cover **error envelope** shape asymmetry (the `{error, type}` shape vs `{kind, error}` shape); #340/#341/#347/#349/#350 cover specific subcommand envelopes where a not-found/unsupported state is shaped wrong; #121 covers the `doctor` `message`/`report` byte-duplication. **This pinpoint is the cross-subcommand catalog of success envelopes** — the structural fact that no single top-level field is present on every success envelope means **no portable success detection is possible** across the catalog without per-subcommand special-casing. It is the meta-pattern that the per-subcommand bugs above are individual instances of. **Trace:** every envelope is emitted from a separate code path with no shared envelope constructor. `status` JSON at `rust/crates/rusty-claude-cli/src/main.rs:5738` emits `"status": "ok"` directly. `mcp` JSON elsewhere emits both `status:"ok"` and `action:"list"`. `skills`/`agents` emit `action:"list"` and `summary` but no `status`. `doctor` JSON at `main.rs:1905-1923` emits `kind`/`message`/`report`/`has_failures`/`summary{ok,warn,fail,total}`/`checks[]` — no `status` field at all, while every `checks[i]` has its own `status`. `sandbox`/`init`/`system-prompt`/`version` each emit different ad-hoc shapes. There is no `EnvelopeBuilder` / `BaseEnvelope` shared struct that guarantees a uniform success/error discriminator across subcommands. **Why it matters:** structured JSON output exists precisely so claws can write generic dispatch logic. With the current catalog: (a) a claw that wants a single "is this OK?" predicate has to memorize 9 different rules — `status == "ok"` for two, `has_failures == false` for `doctor`, no field at all for `sandbox`/`init`/`system-prompt`/`version`. (b) A claw that wants to lift the human-readable summary has to check `message` for 4 commands, `report` for 1 (`doctor`, where `message == report` byte-for-byte per #121), `summary` for 3, and "no human summary at all" for 2. (c) A generic "render any claw JSON response" UI cannot exist — the consumer must implement a per-`kind` template. (d) New subcommands inherit the chaos: when a contributor adds `claw foo --output-format json`, there is no shared envelope they must conform to, so they invent another shape. The 458 entry locks in the cross-envelope catalog before more subcommands ship. **Required fix shape:** (a) **define a single shared `BaseEnvelope` for all `--output-format json` success responses**: `{kind: "", status: "ok"|"warn"|"error", action: ""|null, summary: "", details: }`. The two universal fields are `kind` + `status`; `action` and `summary` are optional-but-encouraged. (b) **Drop ad-hoc `message`/`report` fields** in favor of putting prose in a documented `summary` (one line) + optional `text_render` (full prose, only if a human-text rendering is genuinely needed in JSON mode, with a clear note that machines should not parse it). (c) **`doctor` rollup**: top-level `status` derived from `has_failures` and `summary.warnings` (`"ok"` when both zero, `"warn"` when warnings>0 and failures=0, `"error"` when failures>0); de-duplicate `message`/`report` (per #121). (d) **Regression coverage**: a single contract test that parses every `claw --output-format json` output, asserts `kind` is the subcommand name and `status ∈ {"ok","warn","error"}`, and rejects any envelope that omits either. (e) **Doc the envelope** in a new `docs/json-envelope-contract.md` so new subcommands have a single template to copy from. **Acceptance check (one-liner):** `for c in status mcp skills agents doctor sandbox init system-prompt version; do claw $c --output-format json 2>&1 | jq -e '.kind and (.status | IN("ok","warn","error"))' || echo "FAIL: $c"; done` should print no FAILs. Source: gaebal-gajae dogfood follow-up for the 2026-05-24 09:00 Clawhip pinpoint nudge at message `1508031832669814834`. -459. **Memory file discovery is hardcoded to literal `CLAUDE.md` / `CLAUDE.local.md` / `.claw/CLAUDE.md` / `.claw/instructions.md` with no case-folding and no cross-tool aliases — `claw.md`, `CLAW.md`, `AGENTS.md` (the cross-tool industry standard used by Codex, OpenAI Agents, GitHub Copilot Agents), `claude.md` (lowercase), `CLAUDE.MD` (uppercase ext), `.claude/CLAUDE.md` (the Claude Code documented subdir location), `GEMINI.md`, and `memory.md` are all invisible, AND `status --output-format json` exposes only an opaque `memory_file_count: N` integer with no `memory_files: [{path, source}]` structured field, so a claw cannot tell WHICH files were picked up or WHY their memory file is being ignored** — dogfooded 2026-05-24 for the 10:00 Clawhip pinpoint nudge at message `1508046936177901639`, reproduced on local `./rust/target/debug/claw` `git_sha 003b739d` (origin/main `f8e1bb72`). Discovery matrix in a clean isolated env (`HOME=/tmp/iso10/home`, fresh `/tmp/iso10/proj` git-init, `claw status --output-format json | jq .workspace.memory_file_count` after creating each file in turn): +459. **DONE — memory file discovery expanded and structured** — verified 2026-06-04: instruction cascade now loads `CLAUDE.md`, `CLAUDE.local.md`, `.claw/CLAUDE.md`, `.claw/instructions.md`, `CLAW.md`, `AGENTS.md`, `.claude/CLAUDE.md`. `memory_files` array with `{path, source}` exposed in both `status` and `system-prompt` JSON. | File | `memory_file_count` | |---|---|