Commit Graph

151 Commits

Author SHA1 Message Date
Bellman
b90f18f75a Merge pull request #3214 from TheArchitectit/worktree-api-timeout-retry-v2
feat: API timeout config, Retry-After header, configurable retry, and 400 transient retry
2026-06-05 10:33:35 +09:00
bellman
d4aad7103e fix: add actionable auth hint to 401/403 API errors (#28)
401 and 403 errors now include a hint explaining which env vars to
check for each provider (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.)
and suggesting claw doctor for credential verification.

Generated with https://github.com/Yeachan-Heo/gajae-code
Co-authored-by: Gajae Code <dev@gajae-code.com>
2026-06-05 10:10:24 +09:00
TheArchitectit
9e50cb6e20 Merge remote-tracking branch 'upstream/main' into worktree-api-timeout-retry-v2
# Conflicts:
#	rust/crates/runtime/src/config.rs
#	rust/crates/runtime/src/lib.rs
2026-06-04 09:17:43 -05:00
TheArchitectit
76783377ec fix: address CI failures and reviewer feedback on #3214
- Add missing retry_after: None field to ApiError::Api construction
  in main.rs test. This field was introduced by the Retry-After
  header support but was not added to the test's error initializer,
  causing a compile error under CI's strict mode.

- Remove duplicate #[must_use] attribute on retry_after() method
  in error.rs (lines 134+138 both had it; kept the outer one
  above the doc comment per convention).

- Cargo fmt --all run.

- Reviewer question "Are defaults preserved?" — answered yes:
  ApiTimeoutConfig defaults to 30s connect / 300s request / 8 retries.
  with_retry_policy() is opt-in. No behavior change without explicit
  configuration.
2026-06-03 13:19:25 -05:00
bellman
fa35018769 fix: validate env model selection 2026-06-04 00:30:13 +09:00
bellman
bcc5bfde9c fix: route local OpenAI-compatible models 2026-06-03 23:16:46 +09:00
bellman
c91a3062d5 fix: normalize Anthropic model routing 2026-06-03 22:20:23 +09:00
bellman
54d785d0c0 fix: preserve DeepSeek V4 thinking history 2026-06-03 21:53:54 +09:00
TheArchitectit
04bc5f5788 feat: API timeout config, Retry-After header, configurable retry, and 400 transient retry
Cherry-picked from PR #2816 onto current upstream/main, resolving
conflicts from PR #3015's merge (which added retry_after to ApiError
but some construction sites were missing it).

Commits preserved:
- ade85398: API timeout config, Retry-After header, configurable retry
  - TimeoutConfig in HTTP client builder (connect 30s, request 5min)
  - CLAW_API_CONNECT_TIMEOUT and CLAW_API_REQUEST_TIMEOUT env vars
  - Retry-After header parsing on 429 responses
  - ApiTimeoutConfig in runtime config (settings.json)
- 8a883430: retry 400 responses with transient gateway error bodies
  - Detects known gateway phrases in 400 response bodies
  - Marks them as retryable instead of hard-failing
- ed91a61e: add 'no parseable body' to CONTEXT_WINDOW_ERROR_MARKERS
  - Some providers return 400 with 'no parseable body' for oversized
    requests instead of a proper context_length_exceeded error

Commits skipped (already in upstream via PR #3015):
- 453ab642: optional id field (already merged)
- baa8d1ba: HTML detection in streaming (already merged)
- 33d2f789: JSON error detection in streaming (already merged)

8 files changed, 299 insertions, 80 deletions
2026-06-02 15:35:29 -05:00
TheArchitectit
571d3cdc0f fix: add "no parseable body" to CONTEXT_WINDOW_ERROR_MARKERS
Some OpenAI-compat backends (e.g. glm-5.1-fast) return 400 with
"no parseable body" when the request payload is too large to parse,
rather than a proper context_length_exceeded error. Without this marker,
is_context_window_error() returns false and the auto-compact retry
loop never triggers — the user just sees an opaque 400 error.

💘 Generated with Crush

Assisted-by: GLM 5.1 FP8 via Crush <crush@charm.land>
2026-06-02 15:31:04 -05:00
TheArchitectit
414a1aca4f fix: retry 400 responses with transient gateway error bodies
Some providers/proxies return HTTP 400 with bodies like "no parseable
body" or "connection reset" during transient network blips. These are
not real bad requests — they're gateway errors wearing a 400 mask.
Detect known gateway error phrases in 400 response bodies and mark
them as retryable so the existing exponential backoff handles them.
2026-06-02 15:30:41 -05:00
TheArchitectit
d8c57ed317 feat: API timeout config, Retry-After header support, and configurable retry
- Add TimeoutConfig to HTTP client builder with connect_timeout (30s)
  and request_timeout (5min) defaults, configurable via
  CLAW_API_CONNECT_TIMEOUT and CLAW_API_REQUEST_TIMEOUT env vars
- Add with_timeout() builder to both AnthropicClient and
  OpenAiCompatClient for per-client timeout configuration
- Parse Retry-After header on 429 responses and use it to override
  exponential backoff delay when present
- Add ApiTimeoutConfig to runtime config with apiTimeout settings
  in ~/.claw/settings.json (connectTimeout, requestTimeout, maxRetries)
- Add retry_after field to ApiError::Api for propagating rate limit
  backoff hints through the retry pipeline
2026-06-02 15:30:22 -05:00
YeonGyu-Kim
c70312bd04 fix(#754): missing_credentials hint now newline-delimited so JSON hint field is non-null 2026-05-26 21:23:03 +09:00
YeonGyu-Kim
63a5a87471 fix(#696): exit with typed error when stdin is not a TTY and no prompt piped; fix anthropic/ prefix detection in metadata_for_model 2026-05-25 13:16:12 +09:00
YeonGyu-Kim
78a0ff615a Merge pull request #3014 from wangguan1995/fix_qwen
Add Qwen model token limits for DashScope compatibility
2026-05-25 12:58:59 +09:00
YeonGyu-Kim
6f5465aeaf fix(test): update client_integration version string 0.1.0 -> 0.1.3 2026-05-25 12:49:36 +09:00
Yeachan-Heo
fdbc789694 fix(api): skip preflight for unknown model limits 2026-05-25 12:49:36 +09:00
Yeachan-Heo
779cf1c234 test(api): fill thinking in stream chunk fixtures 2026-05-25 12:49:36 +09:00
YeonGyu-Kim
495e7a015c fix: remove stale retry_after field, Team variant, config_load_error_kind, denied_tools initializer errors
- Remove retry_after: None from ApiError::Api structs in openai_compat.rs (field was removed)
- Remove SlashCommand::Team parse arm (variant was removed from enum)
- Add config_load_error_kind: None to doctor path StatusContext initializer
- Add Thinking arm to all ContentBlock match blocks in trident.rs
- Remove cargo fmt drift across commands, config, compact, tools, trident
2026-05-25 12:01:09 +09:00
YeonGyu-Kim
3364dc4bee chore: fix conflict markers and cargo fmt drift in main (commands, openai_compat, trident, config, tools) 2026-05-25 11:51:44 +09:00
TheArchitectit
7149bbc3d9 fix: streaming robustness — OpenAI parsing, error detection, reasoning content
Improves SSE parsing with raw JSON error detection, HTML response detection (for misconfigured endpoints), thinking/reasoning content from provider-specific delta fields, #[serde(default)] on streaming types for lenient deserialization, compact session boundary guard, and /team slash command. Adds install.sh convenience script.
2026-05-25 11:22:47 +09:00
Ajinkya Kardile
b071fac2cf feat: add native Gemini support to openai_compat provider
Adds early return in wire_model_for_base_url for Gemini/Gemma/XAI/Kimi/Grok model prefixes to ensure the provider prefix is preserved correctly when routing through the OpenAI-compatible provider path.
2026-05-25 11:21:37 +09:00
Luke
739488f613 fix: return conservative token limits for unspecified models
Changes the catch-all arm in model_token_limit() from None to conservative defaults (max_output_tokens: 16_384, context_window_tokens: 131_072) to prevent crashes when an unknown model is used.
2026-05-25 11:21:22 +09:00
Luke
a61d023583 fix: unify user_agent to 'clawd-rust-tools/0.1'
Sets user_agent on both build_http_client_or_default() and build_http_client_with() to 'clawd-rust-tools/0.1' for consistent HTTP client identification.
2026-05-25 11:21:13 +09:00
bellman
04c2abb412 Stabilize final gate before release checkpoint
Resolve the G012 evidence gate by fixing permission-mode regressions, platform-sensitive tests, and the clippy surface that blocked an all-targets verification run.

Constraint: G012 final gate required docs, board, full workspace tests, and clippy -D warnings evidence before checkpointing.

Rejected: documenting the worker-2 gate failure as an accepted gap | the failing tests and lints were locally reproducible and fixable.

Confidence: high

Scope-risk: moderate

Directive: Preserve read-only permission requirements for read/glob/grep tools; write/edit remain workspace-write or danger-full-access when outside the workspace.

Tested: python3 .github/scripts/check_doc_source_of_truth.py; python3 .github/scripts/check_release_readiness.py; python3 scripts/validate_cc2_board.py --board .omx/cc2/board.json; python3 .omx/cc2/validate_issue_parity_intake.py .omx/cc2/issue-parity-intake.json; cargo fmt --manifest-path rust/Cargo.toml --all -- --check; cargo check --manifest-path rust/Cargo.toml --workspace; cargo test --manifest-path rust/Cargo.toml --workspace -- --nocapture; cargo clippy --manifest-path rust/Cargo.toml --workspace --all-targets -- -D warnings

Not-tested: live network provider smoke tests and remote PR/issue mutations.
2026-05-15 13:34:57 +09:00
bellman
8c9a05e71b Restore provider compatibility diagnostics as API types
Keep the G008 capability and diagnostic helpers compile-ready by restoring the public report/support/severity types that team integrations referenced after merge reconciliation.

Constraint: Final G008 verification failed on missing provider capability and diagnostic type definitions.
Confidence: high
Scope-risk: narrow
Directive: Keep provider diagnostics exported as typed API surfaces; do not replace them with ad-hoc JSON-only status fields.
Tested: cargo fmt --manifest-path rust/Cargo.toml --all -- --check; git diff --check; cargo test --manifest-path rust/Cargo.toml -p api providers:: -- --nocapture --test-threads=1; cargo test --manifest-path rust/Cargo.toml -p api --test openai_compat_integration -- --nocapture --test-threads=1
Not-tested: full workspace clippy; known unrelated runtime policy_engine struct_excessive_bools remains outside G008.

Co-authored-by: OmX <omx@oh-my-codex.dev>
2026-05-15 10:37:20 +09:00
bellman
dccb3e72d9 Stabilize OpenAI-compatible mock transport verification
Keep the mock HTTP/SSE/proxy coverage deterministic under strict linting while preserving provider request behavior.\n\nConstraint: Task 4 scope is limited to OpenAI-compatible HTTP/SSE/proxy coverage and provider compatibility surfaces.\nRejected: Environment-variable proxy testing | It races with parallel integration tests and can route unrelated localhost mocks through a single proxy fixture.\nConfidence: high\nScope-risk: narrow\nDirective: Prefer explicit injected reqwest clients for proxy integration tests instead of mutating process proxy environment.\nTested: cargo fmt --check; cargo check -p api; cargo test -p api --test openai_compat_integration -- --nocapture; cargo test -p api\nNot-tested: cargo clippy --no-deps -p api --all-targets -- -D warnings fails on pre-existing anthropic.rs/providers/mod.rs lints outside task scope.\n\nCo-authored-by: OmX <omx@local>
2026-05-15 10:30:19 +09:00
bellman
ea95bf2576 omx(team): auto-checkpoint worker-3 [unknown] 2026-05-15 10:30:16 +09:00
bellman
dec8efa5c8 omx(team): auto-checkpoint worker-1 [1] 2026-05-15 10:30:09 +09:00
bellman
ce02ace3a2 omx(team): auto-checkpoint worker-1 [1] 2026-05-15 10:30:06 +09:00
bellman
bc32639ce3 omx(team): auto-checkpoint worker-1 [1] 2026-05-15 10:30:03 +09:00
bellman
a212c662e5 omx(team): auto-checkpoint worker-1 [1] 2026-05-15 10:30:00 +09:00
bellman
2cac66cd38 Stabilize provider compatibility integration verification
Keep integrated G008 provider changes formatted and compile-ready so worker follow-up commits can merge against a clean leader baseline.

Constraint: G008 provider verification must pass before ultragoal checkpointing.
Confidence: high
Scope-risk: narrow
Directive: Keep provider compatibility follow-ups rebased on this formatted baseline before retrying failed cherry-picks.
Tested: cargo test --manifest-path rust/Cargo.toml -p api providers:: -- --nocapture; cargo test --manifest-path rust/Cargo.toml -p api --test openai_compat_integration -- --nocapture --test-threads=1
Not-tested: full workspace clippy; known pre-existing runtime policy_engine LaneContext clippy warning remains outside this change.

Co-authored-by: OmX <omx@oh-my-codex.dev>
2026-05-15 10:28:50 +09:00
bellman
685f078204 omx(team): auto-checkpoint worker-1 [1] 2026-05-15 10:23:37 +09:00
bellman
e4ef0f7f19 omx(team): auto-checkpoint worker-4 [unknown] 2026-05-15 10:22:03 +09:00
bellman
76581f7239 omx(team): auto-checkpoint worker-3 [unknown] 2026-05-15 10:21:58 +09:00
bellman
82ec223ed4 omx(team): auto-checkpoint worker-2 [unknown] 2026-05-15 10:21:55 +09:00
bellman
a6ca5c489b omx(team): auto-checkpoint worker-4 [unknown] 2026-05-15 10:21:28 +09:00
bellman
3ff8743e79 omx(team): auto-checkpoint worker-2 [unknown] 2026-05-15 10:21:23 +09:00
bellman
29029bfc14 omx(team): auto-checkpoint worker-1 [1] 2026-05-15 10:21:18 +09:00
wangguan1995
8cada12c48 Add Qwen model token limits for DashScope compatibility 2026-05-10 13:09:07 +00:00
YeonGyu-Kim
75c08bc982 fix: REPL display, /compact panic, identity leak, DeepSeek reasoning, thinking blocks
Five interrelated fixes from parallel Hephaestus sessions:

1. fix(repl): display assistant text after spinner (#2981, #2982, #2937)
   - Added final_assistant_text() call after run_turn spinner completes
   - REPL now shows response text like run_prompt_json does

2. fix(compact): handle Thinking content blocks (#2985)
   - Added ContentBlock::Thinking variant throughout compact summarizer
   - Prevents panic when /compact encounters thinking blocks

3. fix(prompt): provider-aware model identity (#2822)
   - New ModelFamilyIdentity enum (Claude vs Generic)
   - Non-Anthropic models no longer say 'I am Claude'
   - model_family_identity_for() detects provider and sets identity

4. fix(openai): preserve DeepSeek reasoning_content (#2821)
   - Stream parser now captures reasoning_content from OpenAI-compat
   - Emits ThinkingDelta/SignatureDelta events for reasoning models
   - Thinking blocks included in conversation history for re-send

5. feat(runtime): Thinking block support across codebase
   - AssistantEvent::Thinking variant in conversation.rs
   - ContentBlock::Thinking in session serialization
   - Thinking-aware compact summarization
   - Tests for thinking block ordering and content

Closes #2981, #2982, #2937, #2985, #2822, #2821
2026-05-06 15:32:34 +09:00
Andreas Haida
9a512633a5 Cap OpenAI default output tokens using model metadata 2026-05-03 22:16:12 +02:00
Andreas Haida
6ac13ffdad Handle OpenAI token-limit errors as context-window failures 2026-05-03 22:16:12 +02:00
Yeachan-Heo
74ea754d29 Restore Rust formatting compliance
Run rustfmt from the Rust workspace so CI format checks pass without changing behavior.

Constraint: Scope is formatting-only across tracked Rust files

Confidence: high

Scope-risk: narrow

Tested: cd rust && cargo fmt --check

Tested: git diff --check
2026-04-28 09:19:16 +00:00
Yeachan-Heo
00d0eb61d4 US-024: Add token limit metadata for kimi models
Add ModelTokenLimit entries for kimi-k2.5 and kimi-k1.5 to enable
preflight context window validation. Per Moonshot AI documentation:
- Context window: 256,000 tokens
- Max output: 16,384 tokens

Includes 3 unit tests:
- returns_context_window_metadata_for_kimi_models
- kimi_alias_resolves_to_kimi_k25_token_limits
- preflight_blocks_oversized_requests_for_kimi_models

All tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-17 04:15:38 +00:00
Yeachan-Heo
d037f9faa8 Fix strip_routing_prefix to handle kimi provider prefix (US-023)
Add "kimi" to the strip_routing_prefix matches so that models like
"kimi/kimi-k2.5" have their prefix stripped before sending to the
DashScope API (consistent with qwen/openai/xai/grok handling).

Also add unit test strip_routing_prefix_strips_kimi_provider_prefix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 19:50:15 +00:00
Yeachan-Heo
cec8d17ca8 Implement US-023: Add automatic routing for kimi models to DashScope
Changes in rust/crates/api/src/providers/mod.rs:
- Add 'kimi' alias to MODEL_REGISTRY resolving to 'kimi-k2.5' with DashScope config
- Add kimi/kimi- prefix routing to DashScope endpoint in metadata_for_model()
- Add resolve_model_alias() handling for kimi -> kimi-k2.5
- Add unit tests: kimi_prefix_routes_to_dashscope, kimi_alias_resolves_to_kimi_k2_5

Users can now use:
- --model kimi (resolves to kimi-k2.5)
- --model kimi-k2.5 (auto-routes to DashScope)
- --model kimi/kimi-k2.5 (explicit provider prefix)

All 127 tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 19:44:21 +00:00
Yeachan-Heo
4cb1db9faa Implement US-022: Enhanced error context for API failures
Add structured error context to API failures:
- Request ID tracking across retries with full context in error messages
- Provider-specific error code mapping with actionable suggestions
- Suggested user actions for common error types (401, 403, 413, 429, 500, 502-504)
- Added suggested_action field to ApiError::Api variant
- Updated enrich_bearer_auth_error to preserve suggested_action

Files changed:
- rust/crates/api/src/error.rs: Add suggested_action field, update Display
- rust/crates/api/src/providers/openai_compat.rs: Add suggested_action_for_status()
- rust/crates/api/src/providers/anthropic.rs: Update error handling

All tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 19:15:00 +00:00
Yeachan-Heo
5e65b33042 US-021: Add request body size pre-flight check for OpenAI-compatible provider 2026-04-16 17:41:57 +00:00