Model Compatibility Guide
This document describes model-specific handling in the OpenAI-compatible provider. When adding new models or providers, review this guide to ensure proper compatibility.
Overview
The openai_compat.rs provider translates Claude Code's internal message format to OpenAI-compatible chat completion requests. Different models have varying requirements for:
- Tool result message fields (is_error)
- Sampling parameters (temperature, top_p, etc.)
- Token limit fields (max_tokens vs max_completion_tokens)
- Base URL routing
Model-Specific Handling
Kimi Models (is_error Exclusion)
Affected models: kimi-k2.5, kimi-k1.5, kimi-moonshot, and any other model whose name starts with kimi- once the provider prefix is stripped (case-insensitive)
Behavior: The is_error field is excluded from tool result messages.
Rationale: Kimi models (via Moonshot AI and DashScope) reject the is_error field with a 400 Bad Request error:
{
"error": {
"type": "invalid_request_error",
"message": "Unknown field: is_error"
}
}
Detection:
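fn model_rejects_is_error_field(model: &str) -> bool {
    // Lowercase first so the match is case-insensitive.
    let lowered = model.to_ascii_lowercase();
    // Drop any provider prefix, e.g. "dashscope/kimi-k2.5" -> "kimi-k2.5".
    let canonical = lowered.rsplit('/').next().unwrap_or(lowered.as_str());
    canonical.starts_with("kimi-")
}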
Testing: See model_rejects_is_error_field_detects_kimi_models and related tests in openai_compat.rs.
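As an illustration of how the detection function might gate the field, here is a minimal sketch using only the standard library. The ToolResult struct and the tool_result_fields helper are hypothetical stand-ins; the actual translate_message() in openai_compat.rs may build its payload differently (for example with a JSON library).

```rust
/// Simplified, hypothetical stand-in for a tool result.
struct ToolResult {
    tool_call_id: String,
    content: String,
    is_error: bool,
}

/// Detection logic as documented above.
fn model_rejects_is_error_field(model: &str) -> bool {
    let lowered = model.to_ascii_lowercase();
    let canonical = lowered.rsplit('/').next().unwrap_or(lowered.as_str());
    canonical.starts_with("kimi-")
}

/// Hypothetical helper: lists the fields that would be sent for a tool result message.
fn tool_result_fields(model: &str, result: &ToolResult) -> Vec<(&'static str, String)> {
    let mut fields = vec![
        ("role", "tool".to_string()),
        ("tool_call_id", result.tool_call_id.clone()),
        ("content", result.content.clone()),
    ];
    // Only include is_error for models that accept it.
    if !model_rejects_is_error_field(model) {
        fields.push(("is_error", result.is_error.to_string()));
    }
    fields
}
```

With this sketch, a request for dashscope/Kimi-K2.5 carries three fields, while a request for a model that is not matched carries is_error as a fourth.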
Reasoning Models (Tuning Parameter Stripping)
Affected models:
- OpenAI: o1, o1-*, o3, o3-*, o4, o4-*
- xAI: grok-3-mini
- Alibaba DashScope: qwen-qwq-*, qwq-*, qwen3-*-thinking
Behavior: The following tuning parameters are stripped from requests:
- temperature
- top_p
- frequency_penalty
- presence_penalty
Rationale: Reasoning/chain-of-thought models use fixed sampling strategies and reject these parameters with 400 errors.
Exception: reasoning_effort is included for compatible models when explicitly set.
Detection:
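fn is_reasoning_model(model: &str) -> bool {
    // Bind the lowercased String first: borrowing from a temporary does not
    // compile, and falling back to `model` would skip the lowercasing.
    let lowered = model.to_ascii_lowercase();
    // Drop any provider prefix, e.g. "openai/o3-mini" -> "o3-mini".
    let canonical = lowered.rsplit('/').next().unwrap_or(lowered.as_str());
    canonical.starts_with("o1")
        || canonical.starts_with("o3")
        || canonical.starts_with("o4")
        || canonical == "grok-3-mini"
        || canonical.starts_with("qwen-qwq")
        || canonical.starts_with("qwq")
        || (canonical.starts_with("qwen3") && canonical.contains("-thinking"))
}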
Testing: See reasoning_model_strips_tuning_params, grok_3_mini_is_reasoning_model, and qwen_reasoning_variants_are_detected tests.
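The stripping step itself could look roughly like the sketch below. The strip_tuning_params helper, the TUNING_PARAMS constant, and the use of a BTreeMap of numeric parameters are assumptions made for illustration; the real request builder may represent parameters differently.

```rust
use std::collections::BTreeMap;

/// Detection logic as documented above.
fn is_reasoning_model(model: &str) -> bool {
    let lowered = model.to_ascii_lowercase();
    let canonical = lowered.rsplit('/').next().unwrap_or(lowered.as_str());
    canonical.starts_with("o1")
        || canonical.starts_with("o3")
        || canonical.starts_with("o4")
        || canonical == "grok-3-mini"
        || canonical.starts_with("qwen-qwq")
        || canonical.starts_with("qwq")
        || (canonical.starts_with("qwen3") && canonical.contains("-thinking"))
}

/// The four tuning parameters listed in this section.
const TUNING_PARAMS: [&str; 4] = [
    "temperature",
    "top_p",
    "frequency_penalty",
    "presence_penalty",
];

/// Hypothetical helper: removes tuning parameters for reasoning models
/// and leaves every other parameter untouched.
fn strip_tuning_params(model: &str, params: &mut BTreeMap<String, f64>) {
    if is_reasoning_model(model) {
        for key in TUNING_PARAMS.iter() {
            params.remove(*key);
        }
    }
}
```

For non-reasoning models the map is left as it was, so callers can apply the helper unconditionally.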
GPT-5 (max_completion_tokens)
Affected models: All models starting with gpt-5
Behavior: Uses max_completion_tokens instead of max_tokens in the request payload.
Rationale: GPT-5 models require the max_completion_tokens field. Legacy max_tokens causes request validation failures:
{
"error": {
"message": "Unknown field: max_tokens"
}
}
Implementation:
let max_tokens_key = if wire_model.starts_with("gpt-5") {
    "max_completion_tokens"
} else {
    "max_tokens"
};
Testing: See gpt5_uses_max_completion_tokens_not_max_tokens and non_gpt5_uses_max_tokens tests.
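To make the selection easy to exercise in isolation, it can be wrapped in a small function. The token_limit_key name is hypothetical. Note that the check shown above is a plain, case-sensitive prefix match, so this sketch assumes wire_model has already had any provider prefix removed; the source does not confirm where that happens.

```rust
/// Hypothetical wrapper around the max_tokens_key selection shown above.
/// Assumes `wire_model` is the bare model name without a provider prefix.
fn token_limit_key(wire_model: &str) -> &'static str {
    if wire_model.starts_with("gpt-5") {
        "max_completion_tokens"
    } else {
        "max_tokens"
    }
}
```

Under that assumption, gpt-5 and gpt-5-mini select max_completion_tokens, while a prefixed name such as openai/gpt-5 would not match and would fall back to max_tokens.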
Qwen Models (DashScope Routing)
Affected models: All models whose names start with qwen
Behavior: Routed to DashScope (https://dashscope.aliyuncs.com/compatible-mode/v1) rather than default providers.
Rationale: Qwen models are hosted by Alibaba Cloud's DashScope service, not OpenAI or Anthropic.
Configuration:
pub const DEFAULT_DASHSCOPE_BASE_URL: &str = "https://dashscope.aliyuncs.com/compatible-mode/v1";
Authentication: Uses DASHSCOPE_API_KEY environment variable.
Note: Some Qwen models are also reasoning models (see Reasoning Models above) and receive both treatments.
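A rough sketch of how the two treatments can combine for one model is shown below. The Treatment struct, treatment_for, and is_qwen_model are hypothetical names, and the assumption that routing strips the provider prefix and lowercases the name in the same way as the detection functions is not confirmed by the source.

```rust
pub const DEFAULT_DASHSCOPE_BASE_URL: &str =
    "https://dashscope.aliyuncs.com/compatible-mode/v1";

/// Lowercases the name and drops any provider prefix.
fn canonical_name(model: &str) -> String {
    let lowered = model.to_ascii_lowercase();
    lowered.rsplit('/').next().unwrap_or(lowered.as_str()).to_string()
}

/// Hypothetical routing check based on the documented qwen prefix rule.
fn is_qwen_model(model: &str) -> bool {
    canonical_name(model).starts_with("qwen")
}

/// Detection logic as documented in the Reasoning Models section.
fn is_reasoning_model(model: &str) -> bool {
    let c = canonical_name(model);
    c.starts_with("o1")
        || c.starts_with("o3")
        || c.starts_with("o4")
        || c == "grok-3-mini"
        || c.starts_with("qwen-qwq")
        || c.starts_with("qwq")
        || (c.starts_with("qwen3") && c.contains("-thinking"))
}

/// Hypothetical summary of the handling a model receives.
struct Treatment {
    /// Some(url) when the request is routed to DashScope.
    base_url: Option<&'static str>,
    /// True when tuning parameters are stripped.
    strip_tuning_params: bool,
}

fn treatment_for(model: &str) -> Treatment {
    Treatment {
        base_url: if is_qwen_model(model) {
            Some(DEFAULT_DASHSCOPE_BASE_URL)
        } else {
            None
        },
        strip_tuning_params: is_reasoning_model(model),
    }
}
```

In this sketch, qwen-qwq-32b is both routed to DashScope and stripped of tuning parameters, qwen-plus is only routed, and o3-mini is only stripped.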
Implementation Details
File Location
All model-specific logic is in:
rust/crates/api/src/providers/openai_compat.rs
Key Functions
| Function | Purpose |
|---|---|
| model_rejects_is_error_field() | Detects models that don't support is_error in tool results |
| is_reasoning_model() | Detects reasoning models that need tuning param stripping |
| translate_message() | Converts internal messages to OpenAI format (applies is_error logic) |
| build_chat_completion_request() | Constructs full request payload (applies all model-specific logic) |
Provider Prefix Handling
All model detection functions strip provider prefixes (e.g., dashscope/kimi-k2.5 → kimi-k2.5) before matching:
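// Keep the lowercased String alive while `canonical` borrows from it.
let lowered = model.to_ascii_lowercase();
let canonical = lowered.rsplit('/').next().unwrap_or(lowered.as_str());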
This ensures consistent detection regardless of whether models are referenced with or without provider prefixes.
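The prefix handling can be tried out on its own with a small helper. The canonical_model_name function below is a hypothetical extraction for illustration; the source inlines this logic in each detection function rather than sharing a helper.

```rust
/// Hypothetical helper: lowercases the model name and keeps only the part
/// after the last '/', so provider prefixes do not affect matching.
fn canonical_model_name(model: &str) -> String {
    let lowered = model.to_ascii_lowercase();
    // rsplit yields the segment after the last '/' first; for a name
    // without '/', that segment is the whole string.
    lowered
        .rsplit('/')
        .next()
        .unwrap_or(lowered.as_str())
        .to_string()
}
```

Because only the last segment is kept, nested prefixes such as a/b/kimi-k2.5 also reduce to kimi-k2.5, and a name with a trailing slash reduces to the empty string, which matches none of the documented patterns.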
Adding New Models
When adding support for new models:
1. Check if the model is a reasoning model
   - Does it reject temperature/top_p parameters?
   - Add to is_reasoning_model() detection
2. Check tool result compatibility
   - Does it reject the is_error field?
   - Add to model_rejects_is_error_field() detection
3. Check token limit field
   - Does it require max_completion_tokens instead of max_tokens?
   - Update the max_tokens_key logic
4. Add tests
   - Unit test for detection function
   - Integration test in build_chat_completion_request
5. Update this documentation
   - Add the model to the affected lists
   - Document any special behavior
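As a worked illustration of the first step, the sketch below adds a made-up model family, acme-r1, to reasoning-model detection. It uses a prefix table so that a new family is a one-line change; this table-driven layout is an alternative arrangement chosen for the example, not the structure used in openai_compat.rs, and acme-r1 is not a real model.

```rust
/// Prefixes that mark a model family as a reasoning model.
/// "acme-r1" is a hypothetical entry added for this example.
const REASONING_PREFIXES: &[&str] = &["o1", "o3", "o4", "qwen-qwq", "qwq", "acme-r1"];

fn is_reasoning_model(model: &str) -> bool {
    let lowered = model.to_ascii_lowercase();
    let canonical = lowered.rsplit('/').next().unwrap_or(lowered.as_str());
    // Exact match and the qwen3 thinking variants keep their own rules.
    canonical == "grok-3-mini"
        || REASONING_PREFIXES
            .iter()
            .any(|prefix| canonical.starts_with(*prefix))
        || (canonical.starts_with("qwen3") && canonical.contains("-thinking"))
}
```

A matching unit test would then assert the new family with and without a provider prefix, alongside a negative case for a model that should keep its tuning parameters.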
Testing
Running Model-Specific Tests
# All OpenAI compatibility tests
cargo test --package api providers::openai_compat
# Specific test categories
cargo test --package api model_rejects_is_error_field
cargo test --package api reasoning_model
cargo test --package api gpt5
cargo test --package api qwen
Test Files
- Unit tests: rust/crates/api/src/providers/openai_compat.rs (in mod tests)
- Integration tests: rust/crates/api/tests/openai_compat_integration.rs
Verifying Model Detection
To verify a model is detected correctly without making API calls:
#[test]
fn my_new_model_is_detected() {
    // is_error handling
    assert!(model_rejects_is_error_field("my-model"));
    // Reasoning model detection
    assert!(is_reasoning_model("my-model"));
    // Provider prefix handling
    assert!(model_rejects_is_error_field("provider/my-model"));
}
Last updated: 2026-04-16
For questions or updates, see the implementation in rust/crates/api/src/providers/openai_compat.rs.