Expand parity harness coverage before behavioral drift lands

The landed mock Anthropic harness now covers multi-tool turns, bash flows,
permission prompt approve/deny paths, and an external plugin tool path.
A machine-readable scenario manifest plus a diff/checklist runner keep the
new scenarios tied back to PARITY.md so future additions stay honest.

Constraint: Must build on the deterministic mock service and clean-environment CLI harness
Rejected: Add an MCP tool scenario now | current MCP tool surface is still stubbed, so plugin coverage is the real executable path
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep rust/mock_parity_scenarios.json, mock_parity_harness.rs, and PARITY.md refs in lockstep
Tested: cargo fmt --all
Tested: cargo clippy --workspace --all-targets -- -D warnings
Tested: cargo test --workspace
Tested: python3 rust/scripts/run_mock_parity_diff.py
Not-tested: Real MCP lifecycle handshakes; remote plugin marketplace install flows
This commit is contained in:
Yeachan-Heo
2026-04-03 04:00:33 +00:00
parent c2f1304a01
commit 85c5b0e01d
7 changed files with 1154 additions and 100 deletions

View File

@@ -1,6 +1,6 @@
# Parity Status — claw-code Rust Port
Last updated: 2026-04-03 (`03bd7f0`)
Last updated: 2026-04-03
## Mock parity harness — milestone 1
@@ -8,6 +8,24 @@ Last updated: 2026-04-03 (`03bd7f0`)
- [x] Reproducible clean-environment CLI harness (`rust/crates/rusty-claude-cli/tests/mock_parity_harness.rs`)
- [x] Scripted scenarios: `streaming_text`, `read_file_roundtrip`, `grep_chunk_assembly`, `write_file_allowed`, `write_file_denied`
## Mock parity harness — milestone 2 (behavioral expansion)
- [x] Scripted multi-tool turn coverage: `multi_tool_turn_roundtrip`
- [x] Scripted bash coverage: `bash_stdout_roundtrip`
- [x] Scripted permission prompt coverage: `bash_permission_prompt_approved`, `bash_permission_prompt_denied`
- [x] Scripted plugin-path coverage: `plugin_tool_roundtrip`
- [x] Behavioral diff/checklist runner: `rust/scripts/run_mock_parity_diff.py`
## Harness v2 behavioral checklist
Canonical scenario map: `rust/mock_parity_scenarios.json`
- Multi-tool assistant turns
- Bash flow roundtrips
- Permission enforcement across tool paths
- Plugin tool execution path
- File tools — harness-validated flows
## Tool Surface: 40/40 (spec parity)
### Real Implementations (behavioral parity — varying depth)
@@ -79,17 +97,23 @@ Last updated: 2026-04-03 (`03bd7f0`)
- [ ] `modeValidation` — validate against current permission mode
- [ ] `shouldUseSandbox` — sandbox decision logic
Harness note: milestone 2 validates bash success plus workspace-write escalation approve/deny flows, but the deeper validation/security submodules above are still open.
**File tools — need verification:**
- [ ] Path traversal prevention (symlink following, ../ escapes)
- [ ] Size limits on read/write
- [ ] Binary file detection
- [ ] Permission mode enforcement (read-only vs workspace-write)
Harness note: read_file, grep_search, write_file allow/deny, and multi-tool same-turn assembly are now covered by the mock parity harness.
**Config/Plugin/MCP flows:**
- [ ] Full MCP server lifecycle (connect, list tools, call tool, disconnect)
- [ ] Plugin install/enable/disable/uninstall full flow
- [ ] Config merge precedence (user > project > local)
Harness note: external plugin discovery + execution is now covered via `plugin_tool_roundtrip`; full lifecycle and MCP behavior remain open.
## Runtime Behavioral Gaps
- [ ] Permission enforcement across all tools (read-only, workspace-write, danger-full-access)
@@ -98,6 +122,8 @@ Last updated: 2026-04-03 (`03bd7f0`)
- [ ] Token counting / cost tracking accuracy
- [x] Streaming response support validated by the mock parity harness
Harness note: current coverage now includes write-file denial, bash escalation approve/deny, and plugin workspace-write execution paths.
## Migration Readiness
- [ ] `PARITY.md` maintained and honest