mirror of https://github.com/instructkr/claw-code.git synced 2026-04-18 02:15:19 +08:00

Files

Yeachan-Heo f65d15fb2f US-010: Add model compatibility documentation

Created comprehensive MODEL_COMPATIBILITY.md documenting:
- Kimi models is_error exclusion (prevents 400 Bad Request)
- Reasoning models tuning parameter stripping (o1, o3, o4, grok-3-mini, qwen-qwq)
- GPT-5 max_completion_tokens requirement
- Qwen model routing through DashScope

Includes implementation details, key functions table, guide for adding new
models, and testing commands. Cross-referenced with existing code comments.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-04-16 10:55:58 +00:00

7.0 KiB

Raw Blame History

Model Compatibility Guide

This document describes model-specific handling in the OpenAI-compatible provider. When adding new models or providers, review this guide to ensure proper compatibility.

Overview
Model-Specific Handling
Implementation Details
Adding New Models
Testing

Overview

The openai_compat.rs provider translates Claude Code's internal message format to OpenAI-compatible chat completion requests. Different models have varying requirements for:

Tool result message fields (is_error)
Sampling parameters (temperature, top_p, etc.)
Token limit fields (max_tokens vs max_completion_tokens)
Base URL routing

Model-Specific Handling

Kimi Models (is_error Exclusion)

Affected models: kimi-k2.5, kimi-k1.5, kimi-moonshot, and any model with kimi in the name (case-insensitive)

Behavior: The is_error field is excluded from tool result messages.

Rationale: Kimi models (via Moonshot AI and DashScope) reject the is_error field with a 400 Bad Request error:

{
  "error": {
    "type": "invalid_request_error",
    "message": "Unknown field: is_error"
  }
}

Detection:

fn model_rejects_is_error_field(model: &str) -> bool {
    let lowered = model.to_ascii_lowercase();
    let canonical = lowered.rsplit('/').next().unwrap_or(lowered.as_str());
    canonical.starts_with("kimi-")
}

Testing: See model_rejects_is_error_field_detects_kimi_models and related tests in openai_compat.rs.

Reasoning Models (Tuning Parameter Stripping)

Affected models:

OpenAI: o1, o1-*, o3, o3-*, o4, o4-*
xAI: grok-3-mini
Alibaba DashScope: qwen-qwq-*, qwq-*, qwen3-*-thinking

Behavior: The following tuning parameters are stripped from requests:

temperature
top_p
frequency_penalty
presence_penalty

Rationale: Reasoning/chain-of-thought models use fixed sampling strategies and reject these parameters with 400 errors.

Exception: reasoning_effort is included for compatible models when explicitly set.

Detection:

fn is_reasoning_model(model: &str) -> bool {
    let canonical = model.to_ascii_lowercase()
        .rsplit('/')
        .next()
        .unwrap_or(model);
    canonical.starts_with("o1")
        || canonical.starts_with("o3")
        || canonical.starts_with("o4")
        || canonical == "grok-3-mini"
        || canonical.starts_with("qwen-qwq")
        || canonical.starts_with("qwq")
        || (canonical.starts_with("qwen3") && canonical.contains("-thinking"))
}

Testing: See reasoning_model_strips_tuning_params, grok_3_mini_is_reasoning_model, and qwen_reasoning_variants_are_detected tests.

GPT-5 (max_completion_tokens)

Affected models: All models starting with gpt-5

Behavior: Uses max_completion_tokens instead of max_tokens in the request payload.

Rationale: GPT-5 models require the max_completion_tokens field. Legacy max_tokens causes request validation failures:

{
  "error": {
    "message": "Unknown field: max_tokens"
  }
}

Implementation:

let max_tokens_key = if wire_model.starts_with("gpt-5") {
    "max_completion_tokens"
} else {
    "max_tokens"
};

Testing: See gpt5_uses_max_completion_tokens_not_max_tokens and non_gpt5_uses_max_tokens tests.

Qwen Models (DashScope Routing)

Affected models: All models with qwen prefix

Behavior: Routed to DashScope (https://dashscope.aliyuncs.com/compatible-mode/v1) rather than default providers.

Rationale: Qwen models are hosted by Alibaba Cloud's DashScope service, not OpenAI or Anthropic.

Configuration:

pub const DEFAULT_DASHSCOPE_BASE_URL: &str = "https://dashscope.aliyuncs.com/compatible-mode/v1";

Authentication: Uses DASHSCOPE_API_KEY environment variable.

Note: Some Qwen models are also reasoning models (see Reasoning Models above) and receive both treatments.

Implementation Details

File Location

All model-specific logic is in:

rust/crates/api/src/providers/openai_compat.rs

Key Functions

Function	Purpose
`model_rejects_is_error_field()`	Detects models that don't support `is_error` in tool results
`is_reasoning_model()`	Detects reasoning models that need tuning param stripping
`translate_message()`	Converts internal messages to OpenAI format (applies `is_error` logic)
`build_chat_completion_request()`	Constructs full request payload (applies all model-specific logic)

Provider Prefix Handling

All model detection functions strip provider prefixes (e.g., dashscope/kimi-k2.5 → kimi-k2.5) before matching:

let canonical = model.to_ascii_lowercase()
    .rsplit('/')
    .next()
    .unwrap_or(model);

This ensures consistent detection regardless of whether models are referenced with or without provider prefixes.

Adding New Models

When adding support for new models:

Check if the model is a reasoning model
- Does it reject temperature/top_p parameters?
- Add to is_reasoning_model() detection
Check tool result compatibility
- Does it reject the is_error field?
- Add to model_rejects_is_error_field() detection
Check token limit field
- Does it require max_completion_tokens instead of max_tokens?
- Update the max_tokens_key logic
Add tests
- Unit test for detection function
- Integration test in build_chat_completion_request
Update this documentation
- Add the model to the affected lists
- Document any special behavior

Testing

Running Model-Specific Tests

# All OpenAI compatibility tests
cargo test --package api providers::openai_compat

# Specific test categories
cargo test --package api model_rejects_is_error_field
cargo test --package api reasoning_model
cargo test --package api gpt5
cargo test --package api qwen

Test Files

Unit tests: rust/crates/api/src/providers/openai_compat.rs (in mod tests)
Integration tests: rust/crates/api/tests/openai_compat_integration.rs

Verifying Model Detection

To verify a model is detected correctly without making API calls:

#[test]
fn my_new_model_is_detected() {
    // is_error handling
    assert!(model_rejects_is_error_field("my-model"));
    
    // Reasoning model detection
    assert!(is_reasoning_model("my-model"));
    
    // Provider prefix handling
    assert!(model_rejects_is_error_field("provider/my-model"));
}

Last updated: 2026-04-16

For questions or updates, see the implementation in rust/crates/api/src/providers/openai_compat.rs.

7.0 KiB Raw Blame History