claw-code/rust/crates/api/tests/openai_compat_integration.rs at 9bd7a78ca85490560a325ac08375a7f3142b83eb

mirror of https://github.com/instructkr/claw-code.git synced 2026-04-07 00:24:50 +08:00

Yeachan-Heo fa72cd665e Block oversized requests before providers hard-fail

The runtime already tracked rough token estimates for compaction, but provider-bound
requests still relied on naive model output limits and could be sent upstream even
when the selected model could not fit the estimated prompt plus requested output.

This adds a small model token/context registry in the API layer, estimates request
size from the serialized prompt payload, and fails locally with a dedicated
context-window error before Anthropic or xAI calls are made. Focused integration
coverage asserts the preflight fires before any HTTP request leaves the process.
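
The preflight described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the names `context_window_for`, `estimate_tokens`, `preflight`, and `ContextWindowExceeded`, the model name `example-small`, its 1,000-token window, and the 4-bytes-per-token heuristic are all assumptions made for the example.

```rust
use std::fmt;

/// Hypothetical error returned when a request cannot fit the model's window.
#[derive(Debug, PartialEq)]
pub struct ContextWindowExceeded {
    pub model: String,
    pub estimated_input_tokens: u32,
    pub requested_output_tokens: u32,
    pub context_window_tokens: u32,
}

impl fmt::Display for ContextWindowExceeded {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "context window blocked for {}: ~{} input + {} output tokens exceeds {}",
            self.model,
            self.estimated_input_tokens,
            self.requested_output_tokens,
            self.context_window_tokens
        )
    }
}

impl std::error::Error for ContextWindowExceeded {}

/// Hypothetical registry lookup. The entry and its limit are placeholders,
/// not real model data.
pub fn context_window_for(model: &str) -> Option<u32> {
    match model {
        "example-small" => Some(1_000),
        _ => None,
    }
}

/// Rough estimate from the serialized payload: about 4 bytes per token,
/// rounded up. The ratio is an assumption for illustration only.
pub fn estimate_tokens(serialized_payload: &str) -> u32 {
    let tokens = (serialized_payload.len() + 3) / 4;
    u32::try_from(tokens).unwrap_or(u32::MAX)
}

/// Fails locally if estimated input plus requested output exceeds the window.
/// Models missing from the registry are passed through unchecked.
pub fn preflight(
    model: &str,
    serialized_payload: &str,
    requested_output_tokens: u32,
) -> Result<(), ContextWindowExceeded> {
    let Some(window) = context_window_for(model) else {
        return Ok(());
    };
    let estimated = estimate_tokens(serialized_payload);
    if estimated.saturating_add(requested_output_tokens) > window {
        return Err(ContextWindowExceeded {
            model: model.to_string(),
            estimated_input_tokens: estimated,
            requested_output_tokens,
            context_window_tokens: window,
        });
    }
    Ok(())
}
```

In this sketch a caller would run `preflight` on the serialized request body immediately before handing it to the HTTP client, so an oversized request returns the error without opening a connection. Letting unregistered models through is one possible reading of the Directive trailer below; a stricter variant could reject unknown models instead.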

Constraint: Keep the first pass minimal and reusable across both Anthropic and OpenAI-compatible providers
Rejected: Auto-compact-and-retry in the same patch | broader control-flow change than the requested minimal preflight
Confidence: medium
Scope-risk: narrow
Reversibility: clean
Directive: Expand the model registry before enabling preflight for additional providers or aliases
Tested: cargo build -p api -p tools -p rusty-claude-cli; cargo test -p api
Not-tested: End-to-end CLI auto-compaction or retry behavior after a local context_window_blocked failure

2026-04-05 16:39:58 +00:00

18 KiB
