News · MarkTechPost · 06-13 · 04:57

Moonshot AI Releases Kimi K2.7-Code: a Coding Model Reporting +21.8% on Kimi Code Bench v2 Over K2.6

This week, Moonshot AI released Kimi K2.7-Code. It is a coding-focused, agentic model. The model weights ship on Hugging Face under a Modified MIT license. You can also reach it through the Kimi API and Kimi Code.

K2.7-Code targets long-horizon software engineering, not general chat. It plans, edits, runs tools, and debugs across many steps. Moonshot pairs the model with a subscription coding platform built around it.

Kimi K2.7-Code

K2.7-Code is a Mixture-of-Experts model. It holds 1T total parameters and activates 32B per token. The design uses 384 experts, with 8 selected per token and 1 shared. It has 61 layers, including 1 dense layer.

Attention uses MLA, and the feed-forward path uses SwiGLU. A MoonViT vision encoder adds 400M parameters for image and video input. The model ships with native INT4 quantization. The context window is 256K tokens (262,144).

Two constraints matter. First, thinking mode is mandatory; disabling it returns an API error. Second, sampling is fixed: temperature 1.0, top_p 0.95, n 1, penalties 0.0. Default max output is 32,768 tokens.
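Because a non-default sampling value is rejected, it can help to check a request locally before sending it. The following is a minimal sketch, not part of any Kimi SDK: `check_request` is a hypothetical helper, and the two penalty field names are an assumption taken from the OpenAI-compatible request schema (the article only says "penalties 0.0").

```python
# Fixed values as reported in the article. The penalty field names are an
# assumption based on the OpenAI-compatible chat completions schema.
FIXED_SAMPLING = {
    "temperature": 1.0,
    "top_p": 0.95,
    "n": 1,
    "presence_penalty": 0.0,
    "frequency_penalty": 0.0,
}


def check_request(params: dict) -> list[str]:
    """Return the sampling overrides that would likely be rejected."""
    problems = []
    for name, fixed in FIXED_SAMPLING.items():
        # Omitting a parameter is fine; only a different value is a problem.
        if name in params and params[name] != fixed:
            problems.append(f"{name} must be {fixed}, got {params[name]}")
    return problems


if __name__ == "__main__":
    print(check_request({"model": "kimi-k2.7-code", "temperature": 0.2}))
```

The simplest way to stay within the constraint is to leave these parameters out of the request entirely.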

You can self-host with vLLM, SGLang, or KTransformers. The Hugging Face repository is large, roughly 595 GB on disk. This is a server-class deployment target, not a laptop model.
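A back-of-the-envelope estimate shows why this is a server-class target. The sketch below assumes every one of the 1T parameters is stored at exactly 4 bits and uses decimal gigabytes; the article does not state which tensors are quantized, so treat the result as a rough lower bound rather than a breakdown of the repository.

```python
TOTAL_PARAMS = 1_000_000_000_000   # 1T total parameters (from the article)
ACTIVE_PARAMS = 32_000_000_000     # 32B activated per token (from the article)
BITS_PER_WEIGHT = 4                # INT4; assumes every weight is quantized
REPO_SIZE_GB = 595                 # approximate repository size (from the article)


def weight_gb(params: int, bits: int) -> float:
    """Storage for `params` weights at `bits` bits each, in decimal GB."""
    return params * bits / 8 / 1e9


total_gb = weight_gb(TOTAL_PARAMS, BITS_PER_WEIGHT)
active_gb = weight_gb(ACTIVE_PARAMS, BITS_PER_WEIGHT)

print(f"All weights at INT4:     ~{total_gb:.0f} GB")
print(f"Active weights per token: ~{active_gb:.0f} GB")
print(f"Repo size above estimate: ~{REPO_SIZE_GB - total_gb:.0f} GB")
```

Under these assumptions the weights alone come to roughly 500 GB, which is in the same range as the reported 595 GB. The source does not explain the difference.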

Benchmark

The Moonshot team published six benchmark rows. They compare K2.7-Code against K2.6, GPT-5.5, and Claude Opus 4.8. K2.7-Code beats K2.6 on every row. The largest coding jump is Kimi Code Bench v2, from 50.9 to 62.0.

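| Benchmark | Kimi K2.6 | Kimi K2.7-Code | GPT-5.5 | Claude Opus 4.8 | K2.7 vs K2.6 |
| --- | --- | --- | --- | --- | --- |
| Kimi Code Bench v2 | 50.9 | 62.0 | 69.0 | 67.4 | +21.8% |
| Program Bench | 48.3 | 53.6 | 69.1 | 63.8 | +11.0% |
| MLS Bench Lite | 26.7 | 35.1 | 35.5 | 42.8 | +31.5% |
| Kimi Claw 24/7 Bench | 42.9 | 46.9 | 52.8 | 50.4 | +9.3% |
| MCP Atlas | 69.4 | 76.0 | 79.4 | 81.3 | +9.5% |
| MCP Mark Verified | 72.8 | 81.1 | 92.9 | 76.4 | +11.4% |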
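The "K2.7 vs K2.6" column is a relative gain, not a difference in points. The short sketch below recomputes it from the two Kimi columns of the table above; the function name is illustrative.

```python
# (K2.6 score, K2.7-Code score) for each benchmark row in the table.
SCORES = {
    "Kimi Code Bench v2": (50.9, 62.0),
    "Program Bench": (48.3, 53.6),
    "MLS Bench Lite": (26.7, 35.1),
    "Kimi Claw 24/7 Bench": (42.9, 46.9),
    "MCP Atlas": (69.4, 76.0),
    "MCP Mark Verified": (72.8, 81.1),
}


def relative_gain(old: float, new: float) -> float:
    """Percentage change from old to new, relative to old."""
    return (new - old) / old * 100


for name, (old, new) in SCORES.items():
    print(f"{name}: +{new - old:.1f} points, +{relative_gain(old, new):.1f}% relative")
```

For Kimi Code Bench v2, the headline +21.8% corresponds to an 11.1-point increase, from 50.9 to 62.0.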

K2.7-Code trails GPT-5.5 on every row and Opus 4.8 on five of six. It does beat Opus 4.8 on MCP Mark Verified, 81.1 versus 76.4. It also lands close to GPT-5.5 on MLS Bench Lite, 35.1 versus 35.5. K2.7-Code ran in Kimi Code CLI, GPT-5.5 in Codex xhigh, and Opus 4.8 in Claude Code xhigh.

Reasoning-Token Efficiency: A Cost Claim, Not Just Quality

The Moonshot team reports about 30% lower reasoning-token usage than K2.6. It frames this as ‘less overthinking.’

Reasoning tokens bill as output tokens on most price cards. Agentic coding runs hundreds or thousands of steps. Each plan, retry, and verification pays the thinking cost again. A 30% cut compounds across a long run.

The effect lands in three places at once. First, lower output-token cost per task. Second, faster steps, which helps interactive CLI sessions. Third, more steps before hitting context limits.
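The third effect can be illustrated with a simple budget calculation. The per-step token counts below are hypothetical, chosen only for illustration, and the sketch assumes every step's tokens (reasoning included) stay in context, which is a worst case. Only the 262,144-token window and the roughly 30% reduction come from the article.

```python
CONTEXT_WINDOW = 262_144  # tokens (from the article)


def steps_until_full(reasoning_per_step: int, other_per_step: int,
                     context: int = CONTEXT_WINDOW) -> int:
    """Whole agent steps that fit if each step adds a fixed number of tokens."""
    return context // (reasoning_per_step + other_per_step)


# Hypothetical per-step figures, not measurements.
baseline_reasoning = 2_000                        # K2.6-style reasoning tokens
other_tokens = 1_500                              # tool output, edits, messages
reduced_reasoning = baseline_reasoning * 70 // 100  # about 30% fewer

print(steps_until_full(baseline_reasoning, other_tokens))  # 74
print(steps_until_full(reduced_reasoning, other_tokens))   # 90
```

With these made-up numbers, a 30% cut in reasoning tokens yields about 20% more steps, because the non-reasoning tokens per step are unchanged.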

Use Cases With Examples

Marktechpost’s Interactive Explorer

Kimi K2.7-Code: Interactive Explorer

Company-reported benchmarks and official API pricing. Released June 12, 2026. Verified June 12, 2026.

Source: Moonshot AI Kimi K2.7-Code model card. K2.7-Code ran in Kimi Code CLI; GPT-5.5 in Codex xhigh; Claude Opus 4.8 in Claude Code xhigh. First-party numbers, not an independent leaderboard.

Cost calculator defaults:
Input tokens / run: 50,000
Output tokens / run: 8,000
Cache hit rate: 50%
Runs / month: 1,000
Reasoning share of output: 40%

Rates: cached input $0.19 / 1M, cache-miss input $0.95 / 1M, output $4.00 / 1M (official Kimi pricing). The savings line illustrates K2.7-Code’s reported ~30% lower reasoning-token usage vs K2.6, applied to the reasoning share of output. Estimate only.

Source: Kimi K2.7-Code Hugging Face model card and Kimi API docs.
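The calculator's arithmetic can be reproduced offline from the quoted rates and default inputs. This is a sketch, not the explorer's actual code. In particular, the savings function is one assumed reading of "applied to the reasoning share of output": it treats the 8,000 output tokens as the K2.7-Code figure and estimates what K2.6 would have spent on reasoning.

```python
# USD per 1M tokens, as quoted in the article.
CACHED_INPUT_RATE = 0.19
MISS_INPUT_RATE = 0.95
OUTPUT_RATE = 4.00


def monthly_cost(input_tokens: int, output_tokens: int,
                 cache_hit_rate: float, runs: int) -> tuple[float, float, float]:
    """Return (input cost, output cost, total) in USD per month."""
    cached = input_tokens * cache_hit_rate
    missed = input_tokens - cached
    input_cost = (cached * CACHED_INPUT_RATE + missed * MISS_INPUT_RATE) / 1e6 * runs
    output_cost = output_tokens * OUTPUT_RATE / 1e6 * runs
    return input_cost, output_cost, input_cost + output_cost


def reasoning_savings(output_tokens: int, reasoning_share: float,
                      reduction: float, runs: int) -> float:
    """Assumed model: K2.6 reasoning tokens = K2.7 reasoning / (1 - reduction)."""
    reasoning = output_tokens * reasoning_share
    baseline = reasoning / (1 - reduction)
    return (baseline - reasoning) * OUTPUT_RATE / 1e6 * runs


input_cost, output_cost, total = monthly_cost(50_000, 8_000, 0.50, 1_000)
print(f"Input cost:  ${input_cost:.2f}")    # $28.50
print(f"Output cost: ${output_cost:.2f}")   # $32.00
print(f"Monthly:     ${total:.2f}")         # $60.50
print(f"Est. savings vs K2.6: ${reasoning_savings(8_000, 0.40, 0.30, 1_000):.2f}")
```

At the default inputs, output tokens account for slightly more than half of the monthly total, even though input tokens outnumber them by more than six to one.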

A Minimal Quickstart

The Kimi API is OpenAI-compatible. The model string is kimi-k2.7-code. Do not override the fixed sampling parameters, or the request returns an error.

import os
from openai import OpenAI

# Base URL and key per the Kimi API docs at platform.moonshot.ai
client = OpenAI(
    api_key=os.environ.get("MOONSHOT_API_KEY"),
    base_url="https://api.moonshot.ai/v1",
)

messages = [
    {"role": "system", "content": "You are a coding agent."},
    {"role": "user", "content": "Refactor utils.py to remove duplicate code."},
]

resp = client.chat.completions.create(
    model="kimi-k2.7-code",
    messages=messages,
    max_tokens=32768,  # default cap; also the maximum
    # thinking is enabled by default and cannot be disabled.
    # temperature (1.0), top_p (0.95), n (1), and penalties (0.0) are
    # fixed server-side. Passing any other value returns an error.
)

msg = resp.choices[0].message
print(msg.content)

# Multi-step tool calls: append the full assistant message so that
# reasoning_content is preserved. Dropping it errors on the next turn.
# messages.append(msg.model_dump())

Two tool-use rules come from the docs. First, keep reasoning_content from the current turn in context. Second, set tool_choice only to "auto" or "none".
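Both rules can be enforced in the message-handling code around the API call. The sketch below works on plain dictionaries so it runs without network access. The helper names are hypothetical, and the sample assistant message is hand-written to stand in for `msg.model_dump()` from a real response; its tool name and contents are made up.

```python
ALLOWED_TOOL_CHOICES = {"auto", "none"}


def validate_tool_choice(tool_choice: str) -> str:
    """Reject tool_choice values other than the two the article lists."""
    if tool_choice not in ALLOWED_TOOL_CHOICES:
        raise ValueError(f"tool_choice must be 'auto' or 'none', got {tool_choice!r}")
    return tool_choice


def append_assistant_turn(messages: list[dict], assistant_message: dict) -> None:
    """Append the whole assistant message so reasoning_content is not dropped."""
    messages.append(dict(assistant_message))


def append_tool_result(messages: list[dict], tool_call_id: str, result: str) -> None:
    """Append a tool result in the OpenAI-compatible 'tool' message shape."""
    messages.append({"role": "tool", "tool_call_id": tool_call_id, "content": result})


# Hand-written stand-in for msg.model_dump(); all values are illustrative.
assistant_message = {
    "role": "assistant",
    "content": "",
    "reasoning_content": "I should read utils.py before refactoring.",
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "read_file", "arguments": '{"path": "utils.py"}'},
    }],
}

messages = [{"role": "user", "content": "Refactor utils.py to remove duplicate code."}]
append_assistant_turn(messages, assistant_message)
append_tool_result(messages, "call_1", "def helper(): ...")
print([m["role"] for m in messages])  # ['user', 'assistant', 'tool']
```

Copying the full assistant message, rather than rebuilding it from `content` alone, is what keeps `reasoning_content` available for the next request.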

How K2.7-Code Compares

| Model | License | Params | Context | API price (in / out per 1M) |
| --- | --- | --- | --- | --- |
| Kimi K2.7-Code | Modified MIT (open) | 1T total / 32B active | 256K | $0.95 / $4.00 |
| Kimi K2.6 | Open-weight | 1T-class MoE | 256K | ~$0.67–0.95 / ~$3.39–4.00 |
| GPT-5.5 | Closed | Not disclosed | Not in Moonshot table | |
| Claude Opus 4.8 | Closed | Not disclosed | 1M | $5.00 / $25.00 |
| Qwen3-Coder-480B-A35B | Open (Qwen license) | 480B / 35B active | 256K native | Varies by host |

K2.7-Code lists $0.19 per 1M for cached input.

Strengths and Weaknesses

Strengths:

Weaknesses:

Key Takeaways


The post Moonshot AI Releases Kimi K2.7-Code: a Coding Model Reporting +21.8% on Kimi Code Bench v2 Over K2.6 appeared first on MarkTechPost.