Commit Graph

12 Commits

Author SHA1 Message Date
YeonGyu-Kim
b513d6e462 fix(api): sanitize tuning params for reasoning models (o1/o3/grok-3-mini)
Reasoning models reject temperature, top_p, frequency_penalty, and
presence_penalty with 400 errors. Instead of letting these flow through
and returning cryptic provider errors, strip them silently at the
request-builder boundary.

is_reasoning_model() classifies: o1*, o3*, o4*, grok-3-mini.
stop sequences are preserved (safe for all providers).

Tests added:
- reasoning_model_strips_tuning_params: o1-mini strips all 4 params, keeps stop
- grok_3_mini_is_reasoning_model: classification coverage for grok-3-mini, o1,
  o3-mini, and negative cases (gpt-4o, grok-3, claude)

85 api lib tests passing, 0 failing.
2026-04-08 07:32:47 +09:00
YeonGyu-Kim
c667d47c70 feat(api): add tuning params (temperature, top_p, penalties, stop) to MessageRequest
MessageRequest was missing standard OpenAI-compatible generation tuning
parameters. Callers had no way to control temperature, top_p,
frequency_penalty, presence_penalty, or stop sequences.

Changes:
- Added 5 optional fields to MessageRequest (all Option, None by default)
- Wired into build_chat_completion_request: only included in payload when set
- All existing construction sites updated with ..Default::default()
- MessageRequest now derives Default for ergonomic partial construction

Tests added:
- tuning_params_included_in_payload_when_set: all 5 params flow into JSON
- tuning_params_omitted_from_payload_when_none: absent params stay absent

83 api lib tests passing, 0 failing.
cargo check --workspace: 0 warnings.
2026-04-08 07:07:33 +09:00
YeonGyu-Kim
5bcbc86a2b feat: b5-slash-help — batch 5 upstream parity 2026-04-07 14:51:27 +09:00
YeonGyu-Kim
6a6c5acb02 feat: b5-reasoning-guard — batch 5 upstream parity 2026-04-07 14:51:27 +09:00
YeonGyu-Kim
f982f24926 fix(api): Windows env hint + .env file loading fallback
When API key missing on Windows, hint about setx. Load .env from CWD
as fallback with simple key=value parser.
2026-04-07 14:22:41 +09:00
YeonGyu-Kim
2a642871ad fix(api): enrich JSON parse errors with response body, provider, and model
Raw 'json_error: no field X' now includes truncated response body,
provider name, and model ID for debugging context.
2026-04-07 14:22:05 +09:00
Yeachan-Heo
d94d792a48 Expose actionable ids for opaque provider failures
Issue #22 was triggered by generic upstream fatal wrappers that only surfaced 'Something went wrong', which left repeated Jobdori-style failures opaque in the CLI. Capture provider request ids on error responses, classify the known generic wrapper as provider_internal, and prefix the user-visible runtime error with the failure class plus session/trace identifiers so operators can correlate the failure quickly.

Constraint: Keep the fix small and user-safe without redesigning the broader runtime error taxonomy
Constraint: Preserve existing non-generic error text unless the wrapper is the known opaque fatal surface
Rejected: Broadly rewriting every runtime error into classified envelopes | unnecessary scope expansion for issue #22
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: If more opaque wrappers appear, extend the marker list and classification helper rather than reintroducing raw wrapper text alone
Tested: cargo test -p api detects_generic_fatal_wrapper_and_classifies_it_as_provider_internal -- --nocapture; cargo test -p api retries_exhausted_preserves_nested_request_id_and_failure_class -- --nocapture; cargo test -p rusty-claude-cli opaque_provider_wrapper_surfaces_failure_class_session_and_trace -- --nocapture; cargo test -p rusty-claude-cli retry_exhaustion_preserves_internal_failure_class_for_generic_provider_wrapper -- --nocapture; cargo test --workspace
Not-tested: Live upstream reproduction of the Jobdori failure against a real provider session
2026-04-06 00:30:28 +00:00
Yeachan-Heo
fa72cd665e Block oversized requests before providers hard-fail
The runtime already tracked rough token estimates for compaction, but provider-bound
requests still relied on naive model output limits and could be sent upstream even
when the selected model could not fit the estimated prompt plus requested output.

This adds a small model token/context registry in the API layer, estimates request
size from the serialized prompt payload, and fails locally with a dedicated
context-window error before Anthropic or xAI calls are made. Focused integration
coverage asserts the preflight fires before any HTTP request leaves the process.

Constraint: Keep the first pass minimal and reusable across both Anthropic and OpenAI-compatible providers
Rejected: Auto-compact-and-retry in the same patch | broader control-flow change than the requested minimal preflight
Confidence: medium
Scope-risk: narrow
Reversibility: clean
Directive: Expand the model registry before enabling preflight for additional providers or aliases
Tested: cargo build -p api -p tools -p rusty-claude-cli; cargo test -p api
Not-tested: End-to-end CLI auto-compaction or retry behavior after a local context_window_blocked failure
2026-04-05 16:39:58 +00:00
Yeachan-Heo
5f1eddf03a Preserve usage accounting on OpenAI SSE streams
OpenAI chat-completions streams can emit a final usage chunk when the\nclient opts in, but the Rust transport was not requesting it. This\nkeeps provider config on the client and adds stream_options.include_usage\nonly for OpenAI streams so normalized message_delta usage reflects the\ntransport without changing xAI request bodies.\n\nConstraint: Keep xAI request bodies unchanged because provider-specific streaming knobs may differ\nRejected: Enable stream_options for every OpenAI-compatible provider | risks sending unsupported params to xAI-style endpoints\nConfidence: high\nScope-risk: narrow\nDirective: Keep provider-specific streaming flags tied to OpenAiCompatConfig instead of inferring provider behavior from URLs\nTested: cargo clippy -p api --tests -- -D warnings\nTested: cargo test -p api openai_streaming_requests -- --nocapture\nTested: cargo test -p api xai_streaming_requests_skip_openai_specific_usage_opt_in -- --nocapture\nTested: cargo test -p api request_translation_uses_openai_compatible_shape -- --nocapture\nTested: cargo test -p api stream_message_normalizes_text_and_multiple_tool_calls -- --exact --nocapture\nNot-tested: Live OpenAI or xAI network calls
2026-04-02 10:04:14 +00:00
Yeachan-Heo
79da7c0adf Make claw's REPL feel self-explanatory from analysis through commit
Claw already had the core slash-command and git primitives, but the UX
still made users work to discover them, understand current workspace
state, and trust what `/commit` was about to do. This change tightens
that flow in the same places Codex-style CLIs do: command discovery,
live status, typo recovery, and commit preflight/output.

The REPL banner and `/help` now surface a clearer starter path, unknown
slash commands suggest likely matches, `/status` includes actionable git
state, and `/commit` explains what it is staging and committing before
and after the model writes the Lore message. I also cleared the
workspace's existing clippy blockers so the verification lane can stay
fully green.

Constraint: Improve UX inside the existing Rust CLI surfaces without adding new dependencies
Rejected: Add more slash commands first | discoverability and feedback were the bigger friction points
Rejected: Split verification lint fixes into a second commit | user requested one solid commit
Confidence: high
Scope-risk: moderate
Directive: Keep slash discoverability, status reporting, and commit reporting aligned so `/help`, `/status`, and `/commit` tell the same workflow story
Tested: cargo fmt --all; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: Manual interactive REPL session against live Anthropic/xAI endpoints
2026-04-02 07:20:35 +00:00
Yeachan-Heo
dcca64d1bd wip: grok provider abstraction 2026-04-01 06:00:48 +00:00
Yeachan-Heo
5654efb7b2 feat: provider abstraction layer + Grok API support 2026-04-01 04:10:46 +00:00