From 28a37fbeddd3517c33b863f6c152c0ac3905677b Mon Sep 17 00:00:00 2001 From: Yeachan-Heo Date: Sun, 26 Apr 2026 05:30:39 +0000 Subject: [PATCH] roadmap: #269 filed --- ROADMAP.md | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/ROADMAP.md b/ROADMAP.md index 1cb4afb..13c7ab9 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -16992,3 +16992,15 @@ Discovery-pattern continuation: this is the **fourth complementary-pinpoint-pair Required fix shape: (a) add `notifications/tools/list_changed` notification handler on the JSON-RPC stdio reader (parallel to #254's resources handler) routing to a per-server channel; (b) add `pub enum ToolCatalogLifecycleEvent { ToolListChanged | ToolAdded(McpTool) | ToolRemoved { qualified_name: String } | ToolSchemaChanged { qualified_name: String, old_schema: JsonValue, new_schema: JsonValue } }` typed event surfaced through `LaneEvents`; (c) add `RuntimeMcpState::refresh_tool_catalog(&mut self) -> Result` that re-runs `manager.discover_tools_best_effort()` and diffs against the previous snapshot, emitting `ToolCatalogLifecycleEvent`s for the delta; (d) instrument the resume entrypoint at `main.rs:2974` (`resume_session`) to instantiate `RuntimeMcpState` (or a lightweight liveness-only variant) when the session has any MCP server configured, refresh the catalog, and surface the diff in the resume-mode `/mcp` output rather than dumping the static config; (e) add `revision: u64` and optional `etag: Option` to `ManagedMcpTool`/`McpTool` so persisted session JSONL turns can record `tool_catalog_revision` per turn; (f) extend `Session` with `pub last_tool_catalog_revision: Option` (and bump `SESSION_VERSION` from `1` to `2` per #259); (g) advertise `tools.listChanged = true` in the initialize handshake at `mcp_stdio.rs:1400` when the runtime supports it; (h) expose `/mcp tools refresh` slash command and `claw mcp tools refresh` CLI subcommand; (i) emit a typed `mcp_tool_catalog_stale` warning to `--output-format json` when resume detects the catalog has diverged from the snapshot embedded in the last session turn. Acceptance: an MCP server that adds a tool between session-end and `claw --resume ` causes the resumed `/mcp` output to show `+1 tool added: srv__new_tool` rather than the static configured-server list with no live cross-check; an MCP server that emits `notifications/tools/list_changed` mid-session causes a `ToolListChanged` lane event and refreshes the tool registry rather than being silently dropped; the persisted session JSONL records `tool_catalog_revision` per turn so post-hoc audit can identify which catalog snapshot the assistant reasoned over. **Status:** Open. No source code changed. Filed 2026-04-26 14:08 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `d90b5f0` before filing (post-rebase fast-forward onto gaebal-gajae's #267 `prompt TEXT` greedy-slurp pinpoint). Cluster delta: Session-resume-tool-catalog-staleness cluster 0→1 (founder, NEW SOLO CLUSTER); complementary-pinpoint-pair-bundle discovery-pattern extended to 4 bundles total (#245+#250 WebSearch, #262+#264 turn-budget, #264+#266 typed-error-axis, #254+#268 MCP-resources/tools-lifecycle-axis-pair). Sister: #254 (MCP Resources lifecycle on the data-handle axis; #268 is the tool-handle axis sister). Smaller-scope by design (matches #253/#254/#257/#258/#260/#261/#262/#263/#264/#265/#266/#267 context-budget discipline). Distinct from #266 (typed-error enum vs missing-refresh primitive — orthogonal axes). Distinct from #259 session-state schema gaps (#259 catalogues what's in JSONL; #268 catalogues what `Session` should have but does not — the `last_tool_catalog_revision` field). Concurrent-dogfood-rebase parity will be confirmed local==origin==fork at HEAD `d90b5f0+#268` after push. + +## Pinpoint #269 — Dogfood status transport lacks channel-aware payload budgeting and delivery receipts: long compact reports can truncate mid-stanza or fail as `Message failed` while the cycle still treats the report as posted + +Dogfooded 2026-04-26 14:30 KST from the live `#clawcode-building-in-public` dogfood loop immediately after #268. The status reporter attempted to publish a growing same-day summary; the visible Discord output truncated mid-sentence at #263 (`**#263** — \`--compact\` help text claims text…`), then a later helper message self-reported `Truncated mid-sentence at #263`, and the cron emitted `Cron job "clawcode-dogfood-cycle-reminder" failed: ⚠️ ✉️ Message failed` followed by another timeout. The loop still printed meta-prose saying the Discord report was posted, even though the transport evidence showed partial delivery/failure. + +Concrete failure mode: the dogfood status path can exceed a channel/provider payload limit and either (a) deliver only a prefix, cutting a pinpoint stanza in half, or (b) fail after attempting delivery, while the cycle's own state/reporting does not carry a typed `delivery_status` / `delivered_bytes` / `truncated_at` receipt. Operators then see conflicting truth: a report says "posted", the channel contains a partial report, and cron says `Message failed` or times out. This is distinct from #253 (state-vector context-budget discipline) and #261 (derived count/range self-consistency): those validate what the summary *says*; #269 validates whether the rendered payload fits the target channel and whether delivery actually succeeded. + +Gap. There is no channel-aware pre-send budget gate for dogfood status payloads, no stanza-safe chunker, no checksum/part numbering, and no authoritative delivery receipt bound to the cycle id. A compact summary can be internally fresh (#259) and arithmetically consistent (#261) yet still be operationally unusable because the transport cuts it mid-stanza or the message send fails after side effects. The status generator also lacks a fail-closed rule: a failed/partial send should mark the cycle as `delivery_failed` or `partial_delivery`, not publish/echo a success summary. + +Required fix shape: (a) add per-channel payload budget metadata (`max_chars`, `safe_chars`, markdown overhead, attachment/thread fallback) to dogfood report rendering; (b) preflight-render the report and split into stanza-safe chunks before send, never in the middle of a pinpoint bullet; (c) add part numbering and a short report id/checksum (`dogfood-status d90b5f0 part 1/3`) so downstream claws can detect missing chunks; (d) record/send a typed delivery receipt with `status: delivered|partial|failed`, `message_ids`, `bytes_sent`, `chunks_sent`, `truncated_at`, and provider error; (e) if any chunk fails, emit a compact failure notice and do not mark the report as posted; (f) regression-test a report containing #257-#268-sized entries against the Discord character budget and assert the split boundaries align to pinpoint stanzas. Acceptance: a same-day summary can never silently truncate mid-pinpoint; a send failure produces a typed `delivery_failed` receipt with no contradictory "posted" success prose; cron timeout can distinguish `timed_out_before_send` vs `timed_out_after_partial_send` (#246 sibling). + +**Status:** Open. No source code changed. Filed 2026-04-26 14:32 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `62b20c7` before filing. Cluster delta: dogfood-transport-delivery-receipt +1; sibling to #246 (cron timeout ambiguity), #253 (state-vector budgeting), and #261 (summary self-consistency), but distinct transport/payload-budget layer. Concrete delta this cycle: ROADMAP-only pinpoint appended from live channel failure evidence.