docs: record plugin lifecycle test flake

2026-07-19 05:18:25 +08:00 · 2026-04-06 01:15:30 +00:00
parent 22e3f8c5e3
commit df0908b10e
1 changed files with 1 additions and 0 deletions
@@ -310,6 +310,7 @@ Priority order: P0 = blocks CI/green state, P1 = blocks integration wiring, P2 =
 21. **Resumed `/status` JSON parity gap** — dogfooding shows fresh `claw status --output-format json` now emits structured JSON, but resumed slash-command status still leaks through a text-shaped path in at least one dispatch path. Local CI-equivalent repro fails `rust/crates/rusty-claude-cli/tests/resume_slash_commands.rs::resumed_status_command_emits_structured_json_when_requested` with `expected value at line 1 column 1`, so resumed automation can receive text where JSON was explicitly requested. **Action:** unify fresh vs resumed `/status` rendering through one output-format contract and add regression coverage so resumed JSON output is guaranteed valid.
 22. **Opaque failure surface for session/runtime crashes** — repeated dogfood-facing failures can currently collapse to generic wrappers like `Something went wrong while processing your request. Please try again, or use /new to start a fresh session.` without exposing whether the fault was provider auth, session corruption, slash-command dispatch, render failure, or transport/runtime panic. This blocks fast self-recovery and turns actionable clawability bugs into blind retries. **Action:** preserve a short user-safe failure class (`provider_auth`, `session_load`, `command_dispatch`, `render`, `runtime_panic`, etc.), attach a local trace/session id, and ensure operators can jump from the chat-visible error to the exact failure log quickly.
 23. **`doctor --output-format json` check-level structure gap** — direct dogfooding shows `claw doctor --output-format json` exposes `has_failures` at the top level, but individual check results (`auth`, `config`, `workspace`, `sandbox`, `system`) are buried inside flat prose fields like `message` / `report`. That forces claws to string-scrape human text instead of consuming stable machine-readable diagnostics. **Action:** emit structured per-check JSON (`name`, `status`, `summary`, `details`, and relevant typed fields such as sandbox fallback reason) while preserving the current human-readable report for text mode.
 24. **Plugin lifecycle init/shutdown test flakes under workspace-parallel execution** — dogfooding surfaced that `build_runtime_runs_plugin_lifecycle_init_and_shutdown` can fail under `cargo test --workspace` while passing in isolation because sibling tests race on tempdir-backed shell init script paths. This is test brittleness rather than a code-path regression, but it still destabilizes CI confidence and wastes diagnosis cycles. **Action:** isolate temp resources per test robustly (unique dirs + no shared cwd assumptions), audit cleanup timing, and add a regression guard so the plugin lifecycle test remains stable under parallel workspace execution.
 **P3 — Swarm efficiency**
 13. Swarm branch-lock protocol — **done**: `branch_lock::detect_branch_lock_collisions()` now detects same-branch/same-scope and nested-module collisions before parallel lanes drift into duplicate implementation
 14. Commit provenance / worktree-aware push events — **done**: lane event provenance now includes branch/worktree/superseded/canonical lineage metadata, and manifest persistence de-dupes superseded commit events before downstream consumers render them