feat(adapters): add Kimi (Moonshot) HTTP adapter with thinking support#3087
Draft
georgeharker wants to merge 3 commits into olimorris:main from
Conversation
Add a dedicated `kimi` adapter for Moonshot's Kimi K2 family. Although
Moonshot's API is OpenAI-compatible enough to work via `openai_compatible`
for simple chats, the K2-thinking variants impose a strict round-trip
requirement that breaks tool-calling chats:
When `think` is enabled, every assistant message carrying `tool_calls`
must also carry a `reasoning_content` field on replay. Omitting it
yields a 400 — `"thinking is enabled but reasoning_content is missing in
assistant tool call message at index N"` — on the second turn of any
tool-using conversation.
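To make the requirement concrete, here is a hypothetical replayed assistant message (illustrative only — the tool name, arguments, and id are invented, not taken from the PR) that passes Moonshot's validator on the second turn:

```python
import json

# Hypothetical turn-two replay payload: when thinking is enabled, an
# assistant message that carries tool_calls must also carry a
# reasoning_content field, or Moonshot's API returns a 400.
assistant_turn = {
    "role": "assistant",
    "content": "",
    "reasoning_content": "The user asked for the weather, so call the tool.",
    "tool_calls": [
        {
            "id": "call_0",
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city": "Oslo"}'},
        }
    ],
}

print(json.dumps(assistant_turn, indent=2))
```

Dropping the `reasoning_content` key from this message is exactly the case that triggers the 400 quoted above.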
OpenAI's Chat Completions schema has no notion of `reasoning_content` on
the wire (it nests reasoning behind a `reasoning` object, populated by
adapters such as Copilot for gemini-3 or DeepSeek's reasoner), so the
generic OpenAI form_messages cannot satisfy Kimi's validator.
This adapter wires the round-trip end-to-end:
- `chat_output` (delegated to OpenAI) already routes non-standard
streaming delta fields through `extra`, so `delta.reasoning_content`
chunks land in `extra.reasoning_content` for free.
- `parse_message_meta` lifts those fragments onto
`data.output.reasoning.content`, the same shape DeepSeek and Copilot
use, so CC stores it as `msg.reasoning` on the assistant message.
- `form_messages` post-processes OpenAI's output: it rewrites the
nested `m.reasoning` into Moonshot's flat `reasoning_content` string
on assistant messages, and inserts `reasoning_content = ""` for
assistant tool-call messages that have no captured reasoning
(chat history that pre-dates the adapter, edited messages, model
swaps mid-conversation). The empty-string fallback satisfies the
validator without fabricating reasoning content.
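The post-processing step can be sketched as follows (illustrative Python — the adapter itself is Lua, and the exact message shapes here are assumptions based on the description above):

```python
# Sketch of the form_messages post-processing: lift a nested m.reasoning
# into Moonshot's flat reasoning_content string on assistant messages,
# and fall back to "" on assistant tool-call messages with no captured
# reasoning, so the validator accepts pre-adapter or edited history.
def patch_reasoning(messages, think=True):
    for m in messages:
        if m.get("role") != "assistant":
            continue
        reasoning = m.pop("reasoning", None)
        if isinstance(reasoning, dict) and reasoning.get("content"):
            m["reasoning_content"] = reasoning["content"]
        elif think and m.get("tool_calls") and "reasoning_content" not in m:
            # Empty string satisfies the validator without fabricating
            # reasoning content.
            m["reasoning_content"] = ""
    return messages

msgs = patch_reasoning([
    {"role": "assistant", "reasoning": {"content": "think first"}, "content": "hi"},
    {"role": "assistant", "tool_calls": [{"id": "call_0"}], "content": ""},
])
print(msgs[0]["reasoning_content"], "|", msgs[1]["reasoning_content"])
```

Note the `think=False` path inserts no fallback, matching the negative test case described later.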
Other K2 quirks captured in the schema:
- `temperature` defaults to `1` (kimi-k2-thinking rejects any other
value).
- `top_p` defaults to `0.95` (same — pinned by the model).
- `think` schema field (boolean, default `true`) so the model is
actually asked to reason; the adapter only produces a wire payload
Kimi accepts when this is on for k2-thinking variants.
Models cover the current K2 line per https://platform.kimi.ai/docs/models
— `kimi-k2.6` (default), `kimi-k2.5`, `kimi-k2-thinking{,-turbo}`,
`kimi-k2-turbo-preview`, `kimi-k2-{0905,0711}-preview`. Older
`moonshot-v1-*` and vision-preview models are intentionally omitted
because they don't support tool calling and would need extra setup-time
gating. Schema follows the anthropic/openai static-`choices` convention.
Structure mirrors `mistral.lua` for review-friendliness: same top-level
field order, same handler-key order with `parse_message_meta` slotted
between `chat_output` and `tools`, same delegate-to-openai pattern for
all standard handlers. The only kimi-specific handler bodies are the
two reasoning-handling additions described above.
tests: cover Kimi (Moonshot) HTTP adapter
Mirrors the structure of test_mistral.lua: a top-level adapter set with a
pre_case hook that resolves the kimi adapter, then three nested groups —
form_messages, Streaming, and No Streaming — using the same hook shape and
test phrasing where behaviour overlaps.
Standard tests (mirrored from mistral):
- form_messages: basic, with tools, and form_tools after extend()
- Streaming: chat-buffer output (stream = true pre_case)
- No Streaming: chat-buffer output, tools, inline assistant (stream = false
pre_case)
Kimi-specific additions cover the reasoning_content round-trip the adapter
exists for:
- form_messages rewrites m.reasoning into a flat reasoning_content string
on assistant messages.
- When think=true, an empty-string reasoning_content is inserted on
assistant tool-call messages with no captured reasoning, satisfying
Moonshot's validator on history that pre-dates the adapter.
- When think=false, no fallback is inserted (negative case).
- Streaming "can process thinking" walks chat_output → parse_message_meta
and asserts both content and reasoning aggregate correctly.
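The aggregation that the streaming test asserts can be sketched like this (illustrative Python, not the plugin's Lua; the chunk shapes follow the OpenAI-style streaming wire format the stubs use):

```python
# Sketch of the "can process thinking" assertion: delta.content and
# delta.reasoning_content fragments arriving in separate streaming
# events aggregate into one assistant message with both a content
# string and a reasoning string.
def aggregate(chunks):
    out = {"content": "", "reasoning": ""}
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        out["content"] += delta.get("content") or ""
        out["reasoning"] += delta.get("reasoning_content") or ""
    return out

msg = aggregate([
    {"choices": [{"delta": {"reasoning_content": "Let me think. "}}]},
    {"choices": [{"delta": {"reasoning_content": "Done."}}]},
    {"choices": [{"delta": {"content": "Hello!"}}]},
])
print(msg["reasoning"], "->", msg["content"])
```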
Stubs follow the OpenAI Chat-Completions wire format (the streaming stub
uses delta.reasoning_content; the tools-no-streaming stub includes a flat
reasoning_content string on the assistant message). Files added:
- tests/adapters/http/test_kimi.lua
- tests/adapters/http/stubs/kimi_streaming.txt
- tests/adapters/http/stubs/kimi_no_streaming.txt
- tests/adapters/http/stubs/kimi_tools_no_streaming.txt
docs: document the Kimi (Moonshot) HTTP adapter
Adds the kimi adapter to the supported-LLMs list (README, doc/index.md,
regenerated doc/codecompanion.txt) and a Setup Examples entry in
doc/configuration/adapters-http.md.
The setup example covers:
- Minimal config (just MOONSHOT_API_KEY plus interactions.chat.adapter).
- Overriding the API-key source via the cmd: prefix (1Password CLI
example) and switching the URL for region-specific endpoints
(e.g. api.moonshot.cn).
- Schema overrides for `model` and `think` so users can pick a non-
thinking K2 model or disable thinking on the K2-thinking variants.
- An IMPORTANT callout that kimi-k2-thinking pins temperature=1 and
top_p=0.95 server-side, matching the adapter's defaults.
The example is placed between the llama.cpp and Ollama sections — both
neighbours involve OpenAI-compatible reasoning configuration, which keeps
the page topically grouped.
Owner
Hey @georgeharker. Appreciate you taking the time to make a PR for this. I'm at the limit of what I want to merge into
Contributor
Author
Totally understood! In which case I'll re-spin this as a plugin
AI Usage
Claude Code was used to help determine the correct format for adapters and the wiring for the thinking content, and to help generate tests.
Related Issue(s)
N/a
Screenshots
N/a
Checklist
make all to ensure docs are generated, tests pass and StyLua has formatted the code