docs(roadmap): record cascade-masking pinpoint under green-ness contract (#9)

Concrete follow-up captured from today's dogfood session:

A single hung test (oversized-request preflight, 6 minutes per attempt
after `be561bf` silently swallowed count_tokens errors) crashed the
`cargo test --workspace` job before downstream crates could run, hiding
6 separate pre-existing CLI regressions until `8c6dfe5` + `5851f2d`
restored the fast-fail path.

Two new acceptance criteria for #9:
- per-test timeouts in CI so one hang cannot mask other failures
- distinguish `test.hung` from generic test failures in worker reports
This commit is contained in:
YeonGyu-Kim
2026-04-08 15:03:30 +09:00
parent 5851f2dee8
commit 7d90283cf9

View File

@@ -194,6 +194,20 @@ Workers should distinguish:
Acceptance:
- no more ambiguous "tests passed" messaging
- merge policy can require the correct green level for the lane type
- a single hung test must not mask other failures: enforce per-test
timeouts in CI (`cargo test --workspace`) so a 6-minute hang in one
crate cannot prevent downstream crates from running their suites
- when a CI job fails because of a hang, the worker must report it as
`test.hung` rather than a generic failure, so triage doesn't conflate
it with a normal `assertion failed`
- recorded pinpoint (2026-04-08): `be561bf` swapped the local
byte-estimate preflight for a `count_tokens` round-trip and silently
returned `Ok(())` on any error, so `send_message_blocks_oversized_*`
hung for ~6 minutes per attempt; the resulting workspace job crash
hid 6 *separate* pre-existing CLI regressions (compact flag
discarded, piped stdin vs permission prompter, legacy session layout,
help/prompt assertions, mock harness count) that only became
diagnosable after `8c6dfe5` + `5851f2d` restored the fast-fail path
## Phase 4 — Claws-First Task Execution