Auto compaction was keying off cumulative usage and re-summarizing from the front of the session, which made long chats shed continuity after the first compaction. The runtime now compacts against the current turn's prompt pressure and preserves prior compacted context as retained summary state instead of treating it like disposable history.
Constraint: Existing /compact behavior and saved-session resume flow had to keep working without schema changes
Rejected: Keep using cumulative input tokens | caused repeat compaction after every subsequent turn once the threshold was crossed
Rejected: Re-summarize prior compacted system messages as ordinary history | degraded continuity and could drop earlier context
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Preserve compacted-summary boundaries when extending compaction again; do not fold prior compacted context back into raw-message removal
Tested: cargo fmt --check; cargo clippy -p runtime -p commands --tests -- -D warnings; cargo test -p runtime; cargo test -p commands
Not-tested: End-to-end interactive CLI auto-compaction against a live Anthropic session
The runtime now auto-compacts completed conversations once cumulative input usage
crosses a configurable threshold, preserving recent context while surfacing an
explicit user notice. The CLI also publishes the requested ant-only slash
commands through the shared commands crate and main dispatch, using meaningful
local implementations for commit/PR/issue/teleport/debug workflows.
Constraint: Reuse the existing Rust compaction pipeline instead of introducing a new summarization stack
Constraint: No new dependencies or broad command-framework rewrite
Rejected: Implement API-driven compaction inside ConversationRuntime now | too much new plumbing for this delivery
Rejected: Expose new commands as parse-only stubs | would not satisfy the requested command availability
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: If runtime later gains true API-backed compaction, preserve the TurnSummary auto-compaction metadata shape so CLI call sites stay stable
Tested: cargo test; cargo build --release; cargo fmt --all; git diff --check; LSP diagnostics directory check
Not-tested: Live Anthropic-backed specialist command flows; gh-authenticated PR/issue creation in a real repo
This threads typed hook settings through runtime config, adds a shell-based hook runner, and executes PreToolUse/PostToolUse around each tool call in the conversation loop. The CLI now rebuilds runtimes with settings-derived hook configuration so user-defined Claude hook commands actually run before and after tools.
Constraint: Hook behavior needed to match Claude-style settings.json hooks without broad plugin/MCP parity work in this change
Rejected: Delay hook loading to the tool executor layer | would miss denied tool calls and duplicate runtime policy plumbing
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Keep hook execution in the runtime loop so permission decisions and tool results remain wrapped by the same conversation semantics
Tested: cargo test; cargo build --release
Not-tested: Real user hook scripts outside the test harness; broader plugin/skills parity
This threads typed hook settings through runtime config, adds a shell-based hook runner, and executes PreToolUse/PostToolUse around each tool call in the conversation loop. The CLI now rebuilds runtimes with settings-derived hook configuration so user-defined Claw hook commands actually run before and after tools.
Constraint: Hook behavior needed to match Claw-style settings.json hooks without broad plugin/MCP parity work in this change
Rejected: Delay hook loading to the tool executor layer | would miss denied tool calls and duplicate runtime policy plumbing
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Keep hook execution in the runtime loop so permission decisions and tool results remain wrapped by the same conversation semantics
Tested: cargo test; cargo build --release
Not-tested: Real user hook scripts outside the test harness; broader plugin/skills parity
This adds a small runtime sandbox policy/status layer, threads
sandbox options through the bash tool, and exposes `/sandbox`
status reporting in the CLI. Linux namespace/network isolation
is best-effort and intentionally reported as requested vs active
so the feature does not overclaim guarantees on unsupported
hosts or nested container environments.
Constraint: No new dependencies for isolation support
Constraint: Must keep filesystem restriction claims honest unless hard mount isolation succeeds
Rejected: External sandbox/container wrapper | too heavy for this workspace and request
Rejected: Inline bash-only changes without shared status model | weaker testability and poorer CLI visibility
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Treat this as observable best-effort isolation, not a hard security boundary, unless stronger mount enforcement is added later
Tested: cargo fmt --all; cargo clippy --workspace --all-targets --all-features -- -D warnings; cargo test --workspace
Not-tested: Manual `/sandbox` REPL run on a real nested-container host
Extended thinking needed to travel end-to-end through the API,
runtime, and CLI so the client can request a thinking budget,
preserve streamed reasoning blocks, and present them in a
collapsed text-first form. The implementation keeps thinking
strictly opt-in, adds a session-local toggle, and reuses the
existing flag/slash-command/reporting surfaces instead of
introducing a new UI layer.
Constraint: Existing non-thinking text/tool flows had to remain backward compatible by default
Constraint: Terminal UX needed a lightweight collapsed representation rather than an interactive TUI widget
Rejected: Heuristic CLI-only parsing of reasoning text | brittle against structured stream payloads
Rejected: Expanded raw thinking output by default | too noisy for normal assistant responses
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Keep thinking blocks structurally separate from answer text unless the upstream API contract changes
Tested: cargo fmt --all; cargo clippy --workspace --all-targets -- -D warnings; cargo test -q
Not-tested: Live upstream thinking payloads against the production API contract
Extended thinking needed to travel end-to-end through the API,
runtime, and CLI so the client can request a thinking budget,
preserve streamed reasoning blocks, and present them in a
collapsed text-first form. The implementation keeps thinking
strictly opt-in, adds a session-local toggle, and reuses the
existing flag/slash-command/reporting surfaces instead of
introducing a new UI layer.
Constraint: Existing non-thinking text/tool flows had to remain backward compatible by default
Constraint: Terminal UX needed a lightweight collapsed representation rather than an interactive TUI widget
Rejected: Heuristic CLI-only parsing of reasoning text | brittle against structured stream payloads
Rejected: Expanded raw thinking output by default | too noisy for normal assistant responses
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Keep thinking blocks structurally separate from answer text unless the upstream API contract changes
Tested: cargo fmt --all; cargo clippy --workspace --all-targets -- -D warnings; cargo test -q
Not-tested: Live upstream thinking payloads against the production API contract
This change makes compaction summaries durable under .claude/memory,
feeds those saved memory files back into prompt context, updates /memory
to report both instruction and project-memory files, and moves TodoWrite
persistence to a human-readable .claude/todos.md file.
Constraint: Reuse existing compaction, prompt loading, and slash-command plumbing rather than add a new subsystem
Constraint: Keep persisted project state under Claude-local .claude/ paths
Rejected: Introduce a dedicated memory service module | larger diff with no clear user benefit for this task
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Project memory files are loaded as prompt context, so future format changes must preserve concise readable content
Tested: cargo fmt --all --manifest-path rust/Cargo.toml
Tested: cargo clippy --manifest-path rust/Cargo.toml --all-targets --all-features -- -D warnings
Tested: cargo test --manifest-path rust/Cargo.toml --all
Not-tested: Long-term retention/cleanup policy for .claude/memory growth
This change makes compaction summaries durable under .claw/memory,
feeds those saved memory files back into prompt context, updates /memory
to report both instruction and project-memory files, and moves TodoWrite
persistence to a human-readable .claw/todos.md file.
Constraint: Reuse existing compaction, prompt loading, and slash-command plumbing rather than add a new subsystem
Constraint: Keep persisted project state under Claw-local .claw/ paths
Rejected: Introduce a dedicated memory service module | larger diff with no clear user benefit for this task
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Project memory files are loaded as prompt context, so future format changes must preserve concise readable content
Tested: cargo fmt --all --manifest-path rust/Cargo.toml
Tested: cargo clippy --manifest-path rust/Cargo.toml --all-targets --all-features -- -D warnings
Tested: cargo test --manifest-path rust/Cargo.toml --all
Not-tested: Long-term retention/cleanup policy for .claw/memory growth
The Rust CLI now stores managed sessions under ~/.claude/sessions,
records additive session metadata in the canonical JSON transcript,
and exposes a /sessions listing alias alongside ID-or-path resume.
Inactive oversized sessions are compacted automatically so old
transcripts remain resumable without growing unchecked.
Constraint: Session JSON must stay backward-compatible with legacy files that lack metadata
Constraint: Managed sessions must use a single canonical JSON file per session without new dependencies
Rejected: Sidecar metadata/index files | duplicated state and diverged from the requested single-file persistence model
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep CLI policy in the CLI; only add transcript-adjacent metadata to runtime::Session unless another consumer truly needs more
Tested: cargo fmt; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: Manual interactive REPL smoke test against the live Anthropic API
The Rust CLI now stores managed sessions under ~/.claw/sessions,
records additive session metadata in the canonical JSON transcript,
and exposes a /sessions listing alias alongside ID-or-path resume.
Inactive oversized sessions are compacted automatically so old
transcripts remain resumable without growing unchecked.
Constraint: Session JSON must stay backward-compatible with legacy files that lack metadata
Constraint: Managed sessions must use a single canonical JSON file per session without new dependencies
Rejected: Sidecar metadata/index files | duplicated state and diverged from the requested single-file persistence model
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep CLI policy in the CLI; only add transcript-adjacent metadata to runtime::Session unless another consumer truly needs more
Tested: cargo fmt; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: Manual interactive REPL smoke test against the live Anthropic API