Files
claw-code/docs/MODEL_COMPATIBILITY.md
Yeachan-Heo f65d15fb2f US-010: Add model compatibility documentation
Created comprehensive MODEL_COMPATIBILITY.md documenting:
- Kimi models is_error exclusion (prevents 400 Bad Request)
- Reasoning models tuning parameter stripping (o1, o3, o4, grok-3-mini, qwen-qwq)
- GPT-5 max_completion_tokens requirement
- Qwen model routing through DashScope

Includes implementation details, key functions table, guide for adding new
models, and testing commands. Cross-referenced with existing code comments.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 10:55:58 +00:00

7.0 KiB

Model Compatibility Guide

This document describes model-specific handling in the OpenAI-compatible provider. When adding new models or providers, review this guide to ensure proper compatibility.

Table of Contents

Overview

The openai_compat.rs provider translates Claude Code's internal message format to OpenAI-compatible chat completion requests. Different models have varying requirements for:

  • Tool result message fields (is_error)
  • Sampling parameters (temperature, top_p, etc.)
  • Token limit fields (max_tokens vs max_completion_tokens)
  • Base URL routing

Model-Specific Handling

Kimi Models (is_error Exclusion)

Affected models: kimi-k2.5, kimi-k1.5, kimi-moonshot, and any model with kimi in the name (case-insensitive)

Behavior: The is_error field is excluded from tool result messages.

Rationale: Kimi models (via Moonshot AI and DashScope) reject the is_error field with a 400 Bad Request error:

{
  "error": {
    "type": "invalid_request_error",
    "message": "Unknown field: is_error"
  }
}

Detection:

fn model_rejects_is_error_field(model: &str) -> bool {
    let lowered = model.to_ascii_lowercase();
    let canonical = lowered.rsplit('/').next().unwrap_or(lowered.as_str());
    canonical.starts_with("kimi-")
}

Testing: See model_rejects_is_error_field_detects_kimi_models and related tests in openai_compat.rs.


Reasoning Models (Tuning Parameter Stripping)

Affected models:

  • OpenAI: o1, o1-*, o3, o3-*, o4, o4-*
  • xAI: grok-3-mini
  • Alibaba DashScope: qwen-qwq-*, qwq-*, qwen3-*-thinking

Behavior: The following tuning parameters are stripped from requests:

  • temperature
  • top_p
  • frequency_penalty
  • presence_penalty

Rationale: Reasoning/chain-of-thought models use fixed sampling strategies and reject these parameters with 400 errors.

Exception: reasoning_effort is included for compatible models when explicitly set.

Detection:

fn is_reasoning_model(model: &str) -> bool {
    let canonical = model.to_ascii_lowercase()
        .rsplit('/')
        .next()
        .unwrap_or(model);
    canonical.starts_with("o1")
        || canonical.starts_with("o3")
        || canonical.starts_with("o4")
        || canonical == "grok-3-mini"
        || canonical.starts_with("qwen-qwq")
        || canonical.starts_with("qwq")
        || (canonical.starts_with("qwen3") && canonical.contains("-thinking"))
}

Testing: See reasoning_model_strips_tuning_params, grok_3_mini_is_reasoning_model, and qwen_reasoning_variants_are_detected tests.


GPT-5 (max_completion_tokens)

Affected models: All models starting with gpt-5

Behavior: Uses max_completion_tokens instead of max_tokens in the request payload.

Rationale: GPT-5 models require the max_completion_tokens field. Legacy max_tokens causes request validation failures:

{
  "error": {
    "message": "Unknown field: max_tokens"
  }
}

Implementation:

let max_tokens_key = if wire_model.starts_with("gpt-5") {
    "max_completion_tokens"
} else {
    "max_tokens"
};

Testing: See gpt5_uses_max_completion_tokens_not_max_tokens and non_gpt5_uses_max_tokens tests.


Qwen Models (DashScope Routing)

Affected models: All models with qwen prefix

Behavior: Routed to DashScope (https://dashscope.aliyuncs.com/compatible-mode/v1) rather than default providers.

Rationale: Qwen models are hosted by Alibaba Cloud's DashScope service, not OpenAI or Anthropic.

Configuration:

pub const DEFAULT_DASHSCOPE_BASE_URL: &str = "https://dashscope.aliyuncs.com/compatible-mode/v1";

Authentication: Uses DASHSCOPE_API_KEY environment variable.

Note: Some Qwen models are also reasoning models (see Reasoning Models above) and receive both treatments.

Implementation Details

File Location

All model-specific logic is in:

rust/crates/api/src/providers/openai_compat.rs

Key Functions

Function Purpose
model_rejects_is_error_field() Detects models that don't support is_error in tool results
is_reasoning_model() Detects reasoning models that need tuning param stripping
translate_message() Converts internal messages to OpenAI format (applies is_error logic)
build_chat_completion_request() Constructs full request payload (applies all model-specific logic)

Provider Prefix Handling

All model detection functions strip provider prefixes (e.g., dashscope/kimi-k2.5kimi-k2.5) before matching:

let canonical = model.to_ascii_lowercase()
    .rsplit('/')
    .next()
    .unwrap_or(model);

This ensures consistent detection regardless of whether models are referenced with or without provider prefixes.

Adding New Models

When adding support for new models:

  1. Check if the model is a reasoning model

    • Does it reject temperature/top_p parameters?
    • Add to is_reasoning_model() detection
  2. Check tool result compatibility

    • Does it reject the is_error field?
    • Add to model_rejects_is_error_field() detection
  3. Check token limit field

    • Does it require max_completion_tokens instead of max_tokens?
    • Update the max_tokens_key logic
  4. Add tests

    • Unit test for detection function
    • Integration test in build_chat_completion_request
  5. Update this documentation

    • Add the model to the affected lists
    • Document any special behavior

Testing

Running Model-Specific Tests

# All OpenAI compatibility tests
cargo test --package api providers::openai_compat

# Specific test categories
cargo test --package api model_rejects_is_error_field
cargo test --package api reasoning_model
cargo test --package api gpt5
cargo test --package api qwen

Test Files

  • Unit tests: rust/crates/api/src/providers/openai_compat.rs (in mod tests)
  • Integration tests: rust/crates/api/tests/openai_compat_integration.rs

Verifying Model Detection

To verify a model is detected correctly without making API calls:

#[test]
fn my_new_model_is_detected() {
    // is_error handling
    assert!(model_rejects_is_error_field("my-model"));
    
    // Reasoning model detection
    assert!(is_reasoning_model("my-model"));
    
    // Provider prefix handling
    assert!(model_rejects_is_error_field("provider/my-model"));
}

Last updated: 2026-04-16

For questions or updates, see the implementation in rust/crates/api/src/providers/openai_compat.rs.