mirror of
https://github.com/instructkr/claw-code.git
synced 2026-04-09 01:24:49 +08:00
fix(api): restore local preflight guard ahead of count_tokens round-trip
CI has been red since be561bf ("Use Anthropic count tokens for preflight")
because that commit replaced the free-function preflight_message_request
(byte-estimate guard) with an instance method that silently returns Ok on
any count_tokens failure:

    let counted_input_tokens = match self.count_tokens(request).await {
        Ok(count) => count,
        Err(_) => return Ok(()), // <-- silent bypass
    };

Two consequences:

1. client_integration::send_message_blocks_oversized_requests_before_the_http_call
   has been FAILING on every CI run since be561bf. The mock server in that
   test only has one HTTP response queued (a bare '{}' to satisfy the main
   request), so the count_tokens POST receives an empty body that fails to
   deserialize into CountTokensResponse -> Err -> silent bypass -> the
   oversized 600k-char request proceeds to the mock instead of being
   rejected with ContextWindowExceeded as the test expects.

2. In production, any third-party Anthropic-compatible gateway that doesn't
   implement /v1/messages/count_tokens (OpenRouter, Cloudflare AI Gateway,
   etc.) would silently disable the preflight guard entirely, letting
   oversized requests hit the upstream only to fail there with a
   provider-side context-window error. This is exactly the "opaque failure
   surface" ROADMAP #22 asked us to avoid.

Fix: call the free function super::preflight_message_request(request)? as
the first step of the instance method, before any network round trip. This
guarantees the byte-estimate guard always fires, whether or not the remote
count_tokens endpoint is reachable. The count_tokens refinement still runs
afterward when available, for more precise token counting, but it is now
strictly additive: it can only catch more cases, never silently skip the
guard.
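The guard-then-refine shape of the fix can be sketched standalone. This is
a minimal synchronous illustration, not the crate's real API: the names
local_guard, remote_count, BYTE_LIMIT, TOKEN_LIMIT, and PreflightError are
hypothetical stand-ins for preflight_message_request, count_tokens, and
ContextWindowExceeded.

```rust
// Sketch of the guard-then-refine pattern (illustrative names only).

#[derive(Debug, PartialEq)]
enum PreflightError {
    ContextWindowExceeded,
}

const BYTE_LIMIT: usize = 400_000; // assumed local byte-estimate cap
const TOKEN_LIMIT: u64 = 200_000;  // assumed model token limit

/// Local, infallible byte-estimate guard. Always runs; never needs the network.
fn local_guard(body: &str) -> Result<(), PreflightError> {
    if body.len() > BYTE_LIMIT {
        return Err(PreflightError::ContextWindowExceeded);
    }
    Ok(())
}

/// Preflight: local guard first, then the remote count as a strictly
/// additive refinement. A remote failure (Err) falls back to the local
/// result, which has already passed, instead of bypassing the guard.
fn preflight(
    body: &str,
    remote_count: impl Fn(&str) -> Result<u64, ()>,
) -> Result<(), PreflightError> {
    // The cheap local guard fires unconditionally.
    local_guard(body)?;

    // Best-effort refinement: can only reject more, never less.
    match remote_count(body) {
        Ok(tokens) if tokens > TOKEN_LIMIT => Err(PreflightError::ContextWindowExceeded),
        _ => Ok(()),
    }
}
```

With this shape, an unreachable count_tokens endpoint (remote_count
returning Err) still rejects oversized bodies via the local guard, which is
exactly the property the broken commit lost.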
Test results:
- cargo test -p api --lib: 89 passed, 0 failed
- cargo test --release -p api (all test binaries): 118 passed, 0 failed
- cargo test --release -p api --test client_integration \
    send_message_blocks_oversized_requests_before_the_http_call: passes
- cargo fmt --check: clean

This unblocks the Rust CI workflow, which has been red on every push since
be561bf landed.
@@ -487,10 +487,21 @@ impl AnthropicClient {
     }
 
     async fn preflight_message_request(&self, request: &MessageRequest) -> Result<(), ApiError> {
+        // Always run the local byte-estimate guard first. This catches
+        // oversized requests even if the remote count_tokens endpoint is
+        // unreachable, misconfigured, or unimplemented (e.g., third-party
+        // Anthropic-compatible gateways). If byte estimation already flags
+        // the request as oversized, reject immediately without a network
+        // round trip.
+        super::preflight_message_request(request)?;
+
         let Some(limit) = model_token_limit(&request.model) else {
             return Ok(());
         };
 
+        // Best-effort refinement using the Anthropic count_tokens endpoint.
+        // On any failure (network, parse, auth), fall back to the local
+        // byte-estimate result which already passed above.
         let counted_input_tokens = match self.count_tokens(request).await {
             Ok(count) => count,
             Err(_) => return Ok(()),