Since the start of 2026, a new player has muscled its way into the AI coding agent race: Kimi Code, built by the Chinese startup Moonshot AI. The open source command-line tool, released under the Apache 2.0 license, is powered by the Kimi K2.5 model and its 1.04 trillion parameters. The model scores 76.8% on SWE-bench Verified, making it the best-performing open source model for coding to date. Up against Claude Code and OpenClaw, this challenger out of Beijing deserves a serious technical look.
Moonshot AI: the startup behind Kimi Code
Moonshot AI was founded in March 2023 by Yang Zhilin, a Tsinghua graduate with a PhD from Carnegie Mellon, alongside Zhou Xinyu and Wu Yuxin, both also Tsinghua alumni. Yang Zhilin is no stranger to the NLP field: he is a co-author of Transformer-XL and XLNet, two foundational papers in language-model architecture. He worked at Google Brain and Meta before founding his own company.
On the funding side, Moonshot AI has raised heavily:
- February 2024: $1 billion led by Alibaba and HongShan (formerly Sequoia China), valuing the company at $2.5 billion
- August 2024: an additional $300 million with Tencent and Gaorong Capital, valuation at $3.3 billion
- January 2025: $500 million in a Series C led by IDG Capital, valuation at $4.3 billion
- Early 2026: valuation estimated at $4.8 billion
Yang Zhilin has set out the ambition of reaching AGI in three stages: long context, multimodal models, and a general architecture capable of continuous self-improvement.
Kimi K2.5: the model under the hood
Kimi Code is built on Kimi K2.5, a natively multimodal model (text, images, video) released as open source in late January 2026. The technical specs are impressive:
- 1.04 trillion total parameters, a Mixture-of-Experts (MoE) architecture with 384 experts
- 32 billion active parameters per token (only about 3.1% of the model is activated at each step)
- 256K tokens of context thanks to MLA (Multi-head Latent Attention), which shrinks the KV cache by a factor of 10
- 15 trillion training tokens mixing visual and textual data
- 61 layers, attention dimension of 7168
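The activation share can be recomputed from the two parameter counts quoted above. The short check below uses only the figures in this article; the function name is illustrative.

```python
# Parameter counts as quoted in this article (in billions).
TOTAL_PARAMS_B = 1040   # 1.04 trillion total parameters
ACTIVE_PARAMS_B = 32    # active parameters

def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's weights that is actually used at one step."""
    return active_b / total_b

fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active share: {fraction:.1%}")  # prints "Active share: 3.1%"
```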
Benchmarks: where does K2.5 stand?
On coding benchmarks, Kimi K2.5 positions itself as the best open source model:
- SWE-bench Verified: 76.8% (versus 80.8% for Claude Opus 4.6, though the latter is proprietary)
- Particularly strong in front-end development and in tool-driven agentic tasks
- Competitive with Claude Sonnet 4.5, with a gap of 4 to 6 points depending on the benchmark
- Outperforms Claude Opus by 7 points on tool-augmented agentic tasks
PARL: the key architectural innovation
The real innovation in Kimi K2.5 lies in its training paradigm: PARL (Parallel-Agent Reinforcement Learning). The principle: an orchestrator agent breaks complex tasks down into parallelizable subtasks, each executed by sub-agents instantiated dynamically.
During training, only the orchestrator is updated through reinforcement learning. The sub-agents are frozen and their execution trajectories excluded from optimization. This separation solves two classic problems of end-to-end co-training: credit-assignment ambiguity and training instability.
The result in production is spectacular: the Agent Swarm system coordinates up to 100 sub-agents in parallel and can run up to 1,500 tool calls per session. Internal evaluations report an 80% reduction in end-to-end execution time and a 3x to 4.5x reduction in the number of critical steps required.
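To make the orchestrator and sub-agent split more concrete, here is a minimal sketch of the fan-out pattern in Python with `asyncio`. It is not Moonshot's implementation: `run_subagent`, `orchestrate` and `MAX_PARALLEL` are hypothetical names, and the sub-agent is a stub that only sleeps where a real one would call a model and tools.

```python
import asyncio

MAX_PARALLEL = 100  # hypothetical cap, mirroring the "up to 100 sub-agents" figure

async def run_subagent(subtask: str) -> str:
    """Stub sub-agent: a real one would call a model and tools."""
    await asyncio.sleep(0.01)  # stands in for model and tool latency
    return f"done: {subtask}"

async def orchestrate(subtasks: list[str]) -> list[str]:
    """Fan subtasks out to sub-agents, at most MAX_PARALLEL at a time."""
    limiter = asyncio.Semaphore(MAX_PARALLEL)

    async def guarded(subtask: str) -> str:
        async with limiter:
            return await run_subagent(subtask)

    # gather preserves input order, so results line up with subtasks
    return await asyncio.gather(*(guarded(s) for s in subtasks))

if __name__ == "__main__":
    results = asyncio.run(orchestrate(["lint", "tests", "docs"]))
    print(results)
```

Because the subtasks run concurrently, total wall-clock time in this toy version is close to that of the slowest subtask, not the sum of all of them, which is the intuition behind the reported reduction in end-to-end time.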
Kimi Code CLI: installation and getting started
Kimi Code ships as an open source Python package under the Apache 2.0 license. Installation is straightforward:
# Install via pip
pip install kimi-cli
# Install via Homebrew (macOS/Linux)
brew install kimi-cli
# Or from source
git clone https://github.com/MoonshotAI/kimi-cli.git
cd kimi-cli
make build
Once installed, Kimi Code works like a classic terminal agent:
# Launch Kimi Code inside a project
cd /path/to/your/project
kimi
# Usage examples
kimi "Analyze the structure of this project and identify the dependencies"
kimi "Fix the bug in the authentication module"
kimi "Generate the unit tests for UserService"
One interesting quirk: Kimi Code is not just a coding agent, it is also an interactive shell. By pressing Ctrl+X, you switch into shell command mode and run commands directly without leaving the Kimi environment.
Features: MCP, multimodal and IDE integrations
Native MCP support
Kimi Code natively supports the Model Context Protocol (MCP), the open standard initiated by Anthropic to connect AI models to external tools and data sources. The CLI ships with a full set of MCP management commands, letting you plug third-party MCP servers straight into your working sessions.
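For readers unfamiliar with MCP, the protocol exchanges JSON-RPC 2.0 messages, and a tool invocation uses the `tools/call` method. The sketch below only builds such a message in Python; it does not use Kimi Code's own commands, and the tool name `search_issues` and its arguments are hypothetical.

```python
import json

def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)

# `search_issues` is a hypothetical tool exposed by some third-party server.
payload = build_tool_call(1, "search_issues", {"query": "login bug"})
print(payload)
```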
Native multimodality
Thanks to Kimi K2.5, the CLI brings multimodal capabilities:
- Vision-to-UI: send a screenshot or a mockup, and Kimi Code generates the corresponding front-end code
- Image analysis: bug screenshots, architecture diagrams, technical schematics
- Video input: the model can analyze videos to understand a workflow or reproduce a behavior
This is a concrete edge over most CLI competitors, which remain limited to text.
IDE integrations
Kimi Code is not confined to the terminal. It integrates with:
- VS Code via the official Kimi Code extension
- Cursor and Zed via the ACP (Agent Client Protocol)
- Any ACP-compatible editor
Kimi Code vs Claude Code vs OpenClaw: the field comparison
The market for CLI coding agents has turned into a genuine battlefield in early 2026. Here is a factual comparison of the three main players:
| Criterion | Kimi Code | Claude Code | OpenClaw |
|---|---|---|---|
| Model | Kimi K2.5 (1T MoE) | Claude Opus 4.6 / Sonnet 4.5 | Multi-LLM (Claude, GPT, DeepSeek...) |
| SWE-bench Verified | 76.8% | 80.8% (Opus 4.6) | Depends on the LLM used |
| License | Apache 2.0 | Proprietary (subscription) | MIT |
| Context | 256K tokens | 1M tokens (Opus) | Depends on the LLM |
| Multimodal | Text + images + video | Text + images | Mainly text |
| API cost (per 1M tokens) | ~$0.45 input, $2.50 output | ~$5 input, $25 output (Opus) | Varies by LLM |
| MCP | Yes (native) | Yes (native) | Via plugins |
| Agent Swarm | Up to 100 sub-agents | No (sequential) | No |
In practice
Field feedback paints a nuanced picture. Claude Code remains superior in contextual consistency over long sessions, in the quality of its built-in web search, and in fine-grained permission management. Its ecosystem is more mature, and it is the more reliable of the two.
Kimi Code shines on value for money. At roughly $0.45 per million input tokens versus $5 for Claude Opus, the ratio is about 1 to 11 on input, and 1 to 10 on output ($2.50 versus $25). Its Agent Swarm architecture, capable of massively parallelizing tasks, is a real advantage on large-scope projects. Native multimodality (video included) is also a concrete differentiator.
The weak spot most often flagged: reliability. Several developers report having to rerun requests 2 to 3 times to get a satisfactory result, which erodes the theoretical economic advantage. Context handling over long sessions remains behind Claude Code.
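How much do reruns erode the price advantage? The rough calculation below uses the list prices from the comparison table. The workload (2M input tokens and 0.5M output tokens) is a hypothetical example, and it assumes Claude Opus succeeds on the first attempt, which is a simplification.

```python
# List prices in USD per million tokens, as quoted in the comparison table.
KIMI = {"input": 0.45, "output": 2.50}
OPUS = {"input": 5.00, "output": 25.00}

def session_cost(prices, input_mtok, output_mtok, attempts=1.0):
    """Cost in USD for a workload, repeated `attempts` times on average."""
    per_attempt = prices["input"] * input_mtok + prices["output"] * output_mtok
    return per_attempt * attempts

# Hypothetical workload: 2M input tokens and 0.5M output tokens.
opus = session_cost(OPUS, 2.0, 0.5)
for attempts in (1, 2, 3):
    kimi = session_cost(KIMI, 2.0, 0.5, attempts)
    print(f"{attempts} attempt(s): Kimi ${kimi:.2f} vs Opus ${opus:.2f} "
          f"({opus / kimi:.1f}x cheaper)")
```

Under these assumptions the advantage shrinks from roughly 10.5x with a single attempt to roughly 3.5x with three. Kimi remains cheaper in token cost, but the developer time spent on reruns is not counted here.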
The geopolitical angle: China in the AI agent race
The release of Kimi K2.5 and Kimi Code cannot be read through a purely technical lens. It is also a strategic move in the Sino-American tech war.
Since the US restrictions on chip exports (NVIDIA A100/H100 GPUs), Chinese AI companies have had to innovate under constraints. Moonshot AI's answer, like DeepSeek's before it, runs through open source. By releasing Kimi K2.5 under Apache 2.0 and the CLI under the same license, Moonshot positions itself as a contributor to the global ecosystem rather than an exporter of strategic technology, an important diplomatic nuance.
This strategy serves several goals:
- Global adoption: a high-performing open source model attracts developers, builds an ecosystem, and reduces dependence on US players
- Talent recruitment: open source visibility draws in contributors and researchers
- Technological resilience: in the event of further restrictions, a distributed ecosystem is harder to block
- Scientific legitimacy: publishing on arXiv and HuggingFace places Moonshot in the global academic debate
The fact that a Chinese startup valued at $4.8 billion is releasing its best-performing model as open source sends a strong signal. The era when Chinese models were seen as mere GPT clones is over. Kimi K2.5 genuinely innovates, notably with PARL and Agent Swarm, approaches that have no direct equivalent among Western players.
Strengths, limitations and verdict
Kimi Code's strengths
- Unbeatable usage cost (roughly 10x cheaper than Claude Opus at list API prices)
- Open source Apache 2.0: the CLI code and the model weights are both accessible
- Agent Swarm: massive parallelization, a first in a CLI tool
- Full multimodality: text, images and video as native input
- 256K context window: enough for the majority of projects
- MCP and ACP: interoperable with the existing tooling ecosystem
Current limitations
- Lagging reliability: sometimes inconsistent results that require reruns
- Immature ecosystem: documentation, community and plugins less developed than Claude Code's
- Smaller long context: 256K versus 1M tokens for Claude Opus
- Long-session consistency: a tendency to lose the thread over extended interactions
- API dependence: the CLI is open source, but inference runs through Moonshot's servers (unless you self-host, which demands substantial infrastructure for a 1T-parameter model)
Who is it for?
Kimi Code is a sensible choice for developers who:
- Work on front-end projects where design-to-code conversion is frequent
- Are looking for a cost-effective CLI agent for repetitive coding tasks
- Want to experiment with Agent Swarm to parallelize massive refactorings
- Prefer an open source solution that is auditable and modifiable
For critical projects requiring maximum reliability and contextual consistency over long sessions, Claude Code remains the safest choice in 2026. But the gap is narrowing, and the competitive pressure coming from China is pushing the whole industry upward.
The coding agent market is still taking shape. Between Claude Code, OpenClaw, Kimi Code and alternatives like Gemini CLI, developers have never had so much choice. The real revolution of vibe coding may be right here: not a single tool, but a competitive ecosystem where each player pushes the others to improve.