mirror of
https://github.com/instructkr/claw-code.git
synced 2026-04-18 04:05:17 +08:00
US-010: Add model compatibility documentation
Created comprehensive MODEL_COMPATIBILITY.md documenting: - Kimi models is_error exclusion (prevents 400 Bad Request) - Reasoning models tuning parameter stripping (o1, o3, o4, grok-3-mini, qwen-qwq) - GPT-5 max_completion_tokens requirement - Qwen model routing through DashScope Includes implementation details, key functions table, guide for adding new models, and testing commands. Cross-referenced with existing code comments. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
236
docs/MODEL_COMPATIBILITY.md
Normal file
236
docs/MODEL_COMPATIBILITY.md
Normal file
@@ -0,0 +1,236 @@
|
|||||||
|
# Model Compatibility Guide
|
||||||
|
|
||||||
|
This document describes model-specific handling in the OpenAI-compatible provider. When adding new models or providers, review this guide to ensure proper compatibility.
|
||||||
|
|
||||||
|
## Table of Contents
|
||||||
|
|
||||||
|
- [Overview](#overview)
|
||||||
|
- [Model-Specific Handling](#model-specific-handling)
|
||||||
|
- [Kimi Models (is_error Exclusion)](#kimi-models-is_error-exclusion)
|
||||||
|
- [Reasoning Models (Tuning Parameter Stripping)](#reasoning-models-tuning-parameter-stripping)
|
||||||
|
- [GPT-5 (max_completion_tokens)](#gpt-5-max_completion_tokens)
|
||||||
|
- [Qwen Models (DashScope Routing)](#qwen-models-dashscope-routing)
|
||||||
|
- [Implementation Details](#implementation-details)
|
||||||
|
- [Adding New Models](#adding-new-models)
|
||||||
|
- [Testing](#testing)
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
The `openai_compat.rs` provider translates Claude Code's internal message format to OpenAI-compatible chat completion requests. Different models have varying requirements for:
|
||||||
|
|
||||||
|
- Tool result message fields (`is_error`)
|
||||||
|
- Sampling parameters (temperature, top_p, etc.)
|
||||||
|
- Token limit fields (`max_tokens` vs `max_completion_tokens`)
|
||||||
|
- Base URL routing
|
||||||
|
|
||||||
|
## Model-Specific Handling
|
||||||
|
|
||||||
|
### Kimi Models (is_error Exclusion)
|
||||||
|
|
||||||
|
**Affected models:** `kimi-k2.5`, `kimi-k1.5`, `kimi-moonshot`, and any model with `kimi` in the name (case-insensitive)
|
||||||
|
|
||||||
|
**Behavior:** The `is_error` field is **excluded** from tool result messages.
|
||||||
|
|
||||||
|
**Rationale:** Kimi models (via Moonshot AI and DashScope) reject the `is_error` field with a 400 Bad Request error:
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"error": {
|
||||||
|
"type": "invalid_request_error",
|
||||||
|
"message": "Unknown field: is_error"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Detection:**
|
||||||
|
```rust
|
||||||
|
fn model_rejects_is_error_field(model: &str) -> bool {
|
||||||
|
let lowered = model.to_ascii_lowercase();
|
||||||
|
let canonical = lowered.rsplit('/').next().unwrap_or(lowered.as_str());
|
||||||
|
canonical.starts_with("kimi-")
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Testing:** See `model_rejects_is_error_field_detects_kimi_models` and related tests in `openai_compat.rs`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Reasoning Models (Tuning Parameter Stripping)
|
||||||
|
|
||||||
|
**Affected models:**
|
||||||
|
- OpenAI: `o1`, `o1-*`, `o3`, `o3-*`, `o4`, `o4-*`
|
||||||
|
- xAI: `grok-3-mini`
|
||||||
|
- Alibaba DashScope: `qwen-qwq-*`, `qwq-*`, `qwen3-*-thinking`
|
||||||
|
|
||||||
|
**Behavior:** The following tuning parameters are **stripped** from requests:
|
||||||
|
- `temperature`
|
||||||
|
- `top_p`
|
||||||
|
- `frequency_penalty`
|
||||||
|
- `presence_penalty`
|
||||||
|
|
||||||
|
**Rationale:** Reasoning/chain-of-thought models use fixed sampling strategies and reject these parameters with 400 errors.
|
||||||
|
|
||||||
|
**Exception:** `reasoning_effort` is included for compatible models when explicitly set.
|
||||||
|
|
||||||
|
**Detection:**
|
||||||
|
```rust
|
||||||
|
fn is_reasoning_model(model: &str) -> bool {
|
||||||
|
let canonical = model.to_ascii_lowercase()
|
||||||
|
.rsplit('/')
|
||||||
|
.next()
|
||||||
|
.unwrap_or(model);
|
||||||
|
canonical.starts_with("o1")
|
||||||
|
|| canonical.starts_with("o3")
|
||||||
|
|| canonical.starts_with("o4")
|
||||||
|
|| canonical == "grok-3-mini"
|
||||||
|
|| canonical.starts_with("qwen-qwq")
|
||||||
|
|| canonical.starts_with("qwq")
|
||||||
|
|| (canonical.starts_with("qwen3") && canonical.contains("-thinking"))
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Testing:** See `reasoning_model_strips_tuning_params`, `grok_3_mini_is_reasoning_model`, and `qwen_reasoning_variants_are_detected` tests.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### GPT-5 (max_completion_tokens)
|
||||||
|
|
||||||
|
**Affected models:** All models starting with `gpt-5`
|
||||||
|
|
||||||
|
**Behavior:** Uses `max_completion_tokens` instead of `max_tokens` in the request payload.
|
||||||
|
|
||||||
|
**Rationale:** GPT-5 models require the `max_completion_tokens` field. Legacy `max_tokens` causes request validation failures:
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"error": {
|
||||||
|
"message": "Unknown field: max_tokens"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Implementation:**
|
||||||
|
```rust
|
||||||
|
let max_tokens_key = if wire_model.starts_with("gpt-5") {
|
||||||
|
"max_completion_tokens"
|
||||||
|
} else {
|
||||||
|
"max_tokens"
|
||||||
|
};
|
||||||
|
```
|
||||||
|
|
||||||
|
**Testing:** See `gpt5_uses_max_completion_tokens_not_max_tokens` and `non_gpt5_uses_max_tokens` tests.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Qwen Models (DashScope Routing)
|
||||||
|
|
||||||
|
**Affected models:** All models with `qwen` prefix
|
||||||
|
|
||||||
|
**Behavior:** Routed to DashScope (`https://dashscope.aliyuncs.com/compatible-mode/v1`) rather than default providers.
|
||||||
|
|
||||||
|
**Rationale:** Qwen models are hosted by Alibaba Cloud's DashScope service, not OpenAI or Anthropic.
|
||||||
|
|
||||||
|
**Configuration:**
|
||||||
|
```rust
|
||||||
|
pub const DEFAULT_DASHSCOPE_BASE_URL: &str = "https://dashscope.aliyuncs.com/compatible-mode/v1";
|
||||||
|
```
|
||||||
|
|
||||||
|
**Authentication:** Uses `DASHSCOPE_API_KEY` environment variable.
|
||||||
|
|
||||||
|
**Note:** Some Qwen models are also reasoning models (see [Reasoning Models](#reasoning-models-tuning-parameter-stripping) above) and receive both treatments.
|
||||||
|
|
||||||
|
## Implementation Details
|
||||||
|
|
||||||
|
### File Location
|
||||||
|
All model-specific logic is in:
|
||||||
|
```
|
||||||
|
rust/crates/api/src/providers/openai_compat.rs
|
||||||
|
```
|
||||||
|
|
||||||
|
### Key Functions
|
||||||
|
|
||||||
|
| Function | Purpose |
|
||||||
|
|----------|---------|
|
||||||
|
| `model_rejects_is_error_field()` | Detects models that don't support `is_error` in tool results |
|
||||||
|
| `is_reasoning_model()` | Detects reasoning models that need tuning param stripping |
|
||||||
|
| `translate_message()` | Converts internal messages to OpenAI format (applies `is_error` logic) |
|
||||||
|
| `build_chat_completion_request()` | Constructs full request payload (applies all model-specific logic) |
|
||||||
|
|
||||||
|
### Provider Prefix Handling
|
||||||
|
|
||||||
|
All model detection functions strip provider prefixes (e.g., `dashscope/kimi-k2.5` → `kimi-k2.5`) before matching:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
let canonical = model.to_ascii_lowercase()
|
||||||
|
.rsplit('/')
|
||||||
|
.next()
|
||||||
|
.unwrap_or(model);
|
||||||
|
```
|
||||||
|
|
||||||
|
This ensures consistent detection regardless of whether models are referenced with or without provider prefixes.
|
||||||
|
|
||||||
|
## Adding New Models
|
||||||
|
|
||||||
|
When adding support for new models:
|
||||||
|
|
||||||
|
1. **Check if the model is a reasoning model**
|
||||||
|
- Does it reject temperature/top_p parameters?
|
||||||
|
- Add to `is_reasoning_model()` detection
|
||||||
|
|
||||||
|
2. **Check tool result compatibility**
|
||||||
|
- Does it reject the `is_error` field?
|
||||||
|
- Add to `model_rejects_is_error_field()` detection
|
||||||
|
|
||||||
|
3. **Check token limit field**
|
||||||
|
- Does it require `max_completion_tokens` instead of `max_tokens`?
|
||||||
|
- Update the `max_tokens_key` logic
|
||||||
|
|
||||||
|
4. **Add tests**
|
||||||
|
- Unit test for detection function
|
||||||
|
- Integration test in `build_chat_completion_request`
|
||||||
|
|
||||||
|
5. **Update this documentation**
|
||||||
|
- Add the model to the affected lists
|
||||||
|
- Document any special behavior
|
||||||
|
|
||||||
|
## Testing
|
||||||
|
|
||||||
|
### Running Model-Specific Tests
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# All OpenAI compatibility tests
|
||||||
|
cargo test --package api providers::openai_compat
|
||||||
|
|
||||||
|
# Specific test categories
|
||||||
|
cargo test --package api model_rejects_is_error_field
|
||||||
|
cargo test --package api reasoning_model
|
||||||
|
cargo test --package api gpt5
|
||||||
|
cargo test --package api qwen
|
||||||
|
```
|
||||||
|
|
||||||
|
### Test Files
|
||||||
|
|
||||||
|
- Unit tests: `rust/crates/api/src/providers/openai_compat.rs` (in `mod tests`)
|
||||||
|
- Integration tests: `rust/crates/api/tests/openai_compat_integration.rs`
|
||||||
|
|
||||||
|
### Verifying Model Detection
|
||||||
|
|
||||||
|
To verify a model is detected correctly without making API calls:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
#[test]
|
||||||
|
fn my_new_model_is_detected() {
|
||||||
|
// is_error handling
|
||||||
|
assert!(model_rejects_is_error_field("my-model"));
|
||||||
|
|
||||||
|
// Reasoning model detection
|
||||||
|
assert!(is_reasoning_model("my-model"));
|
||||||
|
|
||||||
|
// Provider prefix handling
|
||||||
|
assert!(model_rejects_is_error_field("provider/my-model"));
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
*Last updated: 2026-04-16*
|
||||||
|
|
||||||
|
For questions or updates, see the implementation in `rust/crates/api/src/providers/openai_compat.rs`.
|
||||||
2
prd.json
2
prd.json
@@ -144,7 +144,7 @@
|
|||||||
"Document qwen model routing through dashscope",
|
"Document qwen model routing through dashscope",
|
||||||
"Cross-reference with existing code comments"
|
"Cross-reference with existing code comments"
|
||||||
],
|
],
|
||||||
"passes": false,
|
"passes": true,
|
||||||
"priority": "P2"
|
"priority": "P2"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
|
|||||||
13
progress.txt
13
progress.txt
@@ -94,3 +94,16 @@ US-009 COMPLETED (Add unit tests for kimi model compatibility fix)
|
|||||||
4. build_chat_completion_request_kimi_vs_non_kimi_tool_results - full integration test for request building
|
4. build_chat_completion_request_kimi_vs_non_kimi_tool_results - full integration test for request building
|
||||||
- Tests: 4 new tests, 119 unit tests total in api crate (+4), all passing
|
- Tests: 4 new tests, 119 unit tests total in api crate (+4), all passing
|
||||||
- Integration tests: 29 passing (no regressions)
|
- Integration tests: 29 passing (no regressions)
|
||||||
|
|
||||||
|
US-010 COMPLETED (Add model compatibility documentation)
|
||||||
|
- Files: docs/MODEL_COMPATIBILITY.md
|
||||||
|
- Created comprehensive documentation covering:
|
||||||
|
1. Kimi Models (is_error Exclusion) - documents the 400 Bad Request issue and solution
|
||||||
|
2. Reasoning Models (Tuning Parameter Stripping) - covers o1, o3, o4, grok-3-mini, qwen-qwq, qwen3-thinking
|
||||||
|
3. GPT-5 (max_completion_tokens) - documents max_tokens vs max_completion_tokens requirement
|
||||||
|
4. Qwen Models (DashScope Routing) - explains routing and authentication
|
||||||
|
- Added implementation details section with key functions
|
||||||
|
- Added "Adding New Models" guide for future contributors
|
||||||
|
- Added testing section with example commands
|
||||||
|
- Cross-referenced with existing code comments in openai_compat.rs
|
||||||
|
- cargo clippy passes
|
||||||
|
|||||||
Reference in New Issue
Block a user