Lock down CLI-to-mock behavioral parity for Anthropic flows

This adds a deterministic mock Anthropic-compatible /v1/messages service, a clean-environment CLI harness, and repo docs so the first parity milestone can be validated without live network dependencies.

Constraint: First milestone must prove Rust claw can connect from a clean environment and cover streaming, tool assembly, and permission/tool flow
Constraint: No new third-party dependencies; reuse the existing Rust workspace stack
Rejected: Record/replay live Anthropic traffic | nondeterministic and unsuitable for repeatable CI coverage
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep scenario markers and expected tool payload shapes synchronized between the mock service and the harness tests
Tested: cargo fmt --all
Tested: cargo clippy --workspace --all-targets -- -D warnings
Tested: cargo test --workspace
Tested: ./scripts/run_mock_parity_harness.sh
Not-tested: Live Anthropic responses beyond the five scripted harness scenarios
2026-07-18 12:58:25 +08:00 · 2026-04-03 01:15:52 +00:00
parent 1abd951e57
commit c2f1304a01
10 changed files with 1115 additions and 2 deletions
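
The directive in the commit message asks that scenario markers stay synchronized between the mock service and the harness tests. One way to make that easier is to define the scenario names in a single place and resolve markers against it. The following is only a minimal sketch of that idea, not the code in this commit: the `Scenario` enum, the `SCENARIO:` marker prefix, and `detect_scenario` are hypothetical names chosen for illustration; only the five scenario names come from the README changes.

```rust
/// The five scenario names listed under "Harness coverage" in the README diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Scenario {
    StreamingText,
    ReadFileRoundtrip,
    GrepChunkAssembly,
    WriteFileAllowed,
    WriteFileDenied,
}

impl Scenario {
    /// Single source of truth that both a mock and its tests could iterate over.
    const ALL: [Scenario; 5] = [
        Scenario::StreamingText,
        Scenario::ReadFileRoundtrip,
        Scenario::GrepChunkAssembly,
        Scenario::WriteFileAllowed,
        Scenario::WriteFileDenied,
    ];

    fn name(self) -> &'static str {
        match self {
            Scenario::StreamingText => "streaming_text",
            Scenario::ReadFileRoundtrip => "read_file_roundtrip",
            Scenario::GrepChunkAssembly => "grep_chunk_assembly",
            Scenario::WriteFileAllowed => "write_file_allowed",
            Scenario::WriteFileDenied => "write_file_denied",
        }
    }

    fn from_name(name: &str) -> Option<Scenario> {
        Self::ALL.iter().copied().find(|s| s.name() == name)
    }
}

/// Hypothetical marker prefix; the real marker format is not shown in this diff.
const MARKER_PREFIX: &str = "SCENARIO:";

/// Returns the scenario named by the first marker token in the prompt, if any.
fn detect_scenario(prompt: &str) -> Option<Scenario> {
    prompt
        .split_whitespace()
        .find_map(|token| token.strip_prefix(MARKER_PREFIX))
        .and_then(|name| Scenario::from_name(name))
}

fn main() {
    let prompt = "SCENARIO:write_file_denied please write a file";
    println!("{:?}", detect_scenario(prompt));
}
```

With a shared table like this, a test could loop over `Scenario::ALL` and fail if the mock does not recognize one of the names, so a renamed scenario is caught at test time instead of drifting silently.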
@@ -35,6 +35,34 @@ Or authenticate via OAuth:
 claw login
 ```

+## Mock parity harness
+
+The workspace now includes a deterministic Anthropic-compatible mock service and a clean-environment CLI harness for end-to-end parity checks.
+
+```bash
+cd rust/
+
+# Run the scripted clean-environment harness
+./scripts/run_mock_parity_harness.sh
+
+# Or start the mock service manually for ad hoc CLI runs
+cargo run -p mock-anthropic-service -- --bind 127.0.0.1:0
+```
+
+Harness coverage:
+
+- `streaming_text`
+- `read_file_roundtrip`
+- `grep_chunk_assembly`
+- `write_file_allowed`
+- `write_file_denied`
+
+Primary artifacts:
+
+- `crates/mock-anthropic-service/` — reusable mock Anthropic-compatible service
+- `crates/rusty-claude-cli/tests/mock_parity_harness.rs` — clean-env CLI harness
+- `scripts/run_mock_parity_harness.sh` — reproducible wrapper
+
 ## Features

 | Feature | Status |
@@ -124,6 +152,7 @@ rust/
    ├── api/                # Anthropic API client + SSE streaming
    ├── commands/           # Shared slash-command registry
    ├── compat-harness/     # TS manifest extraction harness
+    ├── mock-anthropic-service/ # Deterministic local Anthropic-compatible mock
    ├── runtime/            # Session, config, permissions, MCP, prompts
    ├── rusty-claude-cli/   # Main CLI binary (`claw`)
    └── tools/              # Built-in tool implementations
@@ -134,6 +163,7 @@ rust/
 - **api** — HTTP client, SSE stream parser, request/response types, auth (API key + OAuth bearer)
 - **commands** — Slash command definitions and help text generation
 - **compat-harness** — Extracts tool/prompt manifests from upstream TS source
+- **mock-anthropic-service** — Deterministic `/v1/messages` mock for CLI parity tests and local harness runs
 - **runtime** — `ConversationRuntime` agentic loop, `ConfigLoader` hierarchy, `Session` persistence, permission policy, MCP client, system prompt assembly, usage tracking
 - **rusty-claude-cli** — REPL, one-shot prompt, streaming display, tool call rendering, CLI argument parsing
 - **tools** — Tool specs + execution: Bash, ReadFile, WriteFile, EditFile, GlobSearch, GrepSearch, WebSearch, WebFetch, Agent, TodoWrite, NotebookEdit, Skill, ToolSearch, REPL runtimes
@@ -141,7 +171,7 @@ rust/
 ## Stats

 - **~20K lines** of Rust
- **6 crates** in workspace
+- **7 crates** in workspace
 - **Binary name:** `claw`
 - **Default model:** `claude-opus-4-6`
 - **Default permissions:** `danger-full-access`