Newsletter · The Batch · 06-14 · 07:17

Qwen3.7-Max Challenges Google for Third Place, AI Saves Whales, Fine-Tuning Breaks Copyright Alignment

Dear friends,

If you haven’t already, I encourage you to experiment with using AI agents not just to chat but to actually do work for you on your desktop. Desktop agents not only chat with you but also read and edit local files, read/send messages, and provide scheduled deliverables like a daily news summary. While there's nothing wrong with copy-pasting output from web-based chatbots to a desktop or dragging and dropping files into chatbots to give them context, desktop agents can gain context more efficiently as well as take actions directly.

The main way such an agent is built involves creating a set of tools (function calls) for tasks such as file access, web search/web fetch, messaging app integration, and so on; providing these tools to a frontier LLM; and setting up permissions and guardrails. Then you prompt the LLM and let it pick when to use what tool to move forward on a task. The software that wraps around the LLM to implement a desired agentic system is called the agent harness, and it enables the LLM to drive the key loop that decides what to do next at each step.
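To make the loop concrete, here is a minimal sketch of a harness in Python. It is an illustration under stated assumptions, not how OpenCoworker or any commercial agent is implemented: the tool functions, the `ALLOWED` permission set, the `Action` type, and the scripted `fake_llm` policy (which stands in for a real frontier LLM call) are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical tools. Real ones would touch the file system or the network;
# these return canned strings so the sketch runs anywhere.
def read_file(path: str) -> str:
    return f"(contents of {path})"

def web_search(query: str) -> str:
    return f"(search results for {query!r})"

TOOLS: dict[str, Callable[[str], str]] = {
    "read_file": read_file,
    "web_search": web_search,
}

# Guardrail (assumed design): only tools on this allowlist may run.
ALLOWED = {"read_file"}

@dataclass
class Action:
    kind: str            # "tool" to call a tool, "final" to finish
    name: str = ""       # tool name when kind == "tool"
    argument: str = ""   # tool argument, or the final answer text

def fake_llm(history: list[str]) -> Action:
    """Scripted stand-in for an LLM: read a file once, then answer."""
    if not any(entry.startswith("tool:read_file") for entry in history):
        return Action("tool", "read_file", "notes.txt")
    return Action("final", argument="Summary based on notes.txt")

def run_agent(task: str, llm: Callable[[list[str]], Action] = fake_llm,
              max_steps: int = 5) -> tuple[str, list[str]]:
    """The key loop: at each step the model decides what to do next."""
    history = [f"user:{task}"]
    for _ in range(max_steps):
        action = llm(history)
        if action.kind == "final":
            return action.argument, history
        if action.name not in ALLOWED or action.name not in TOOLS:
            # Record the refusal so the model can see it and choose again.
            history.append(f"denied:{action.name}")
            continue
        result = TOOLS[action.name](action.argument)
        history.append(f"tool:{action.name} -> {result}")
    return "stopped: step limit reached", history

if __name__ == "__main__":
    answer, trace = run_agent("Summarize my notes")
    print(answer)
    print(trace)
```

The step limit and the allowlist are the simplest possible guardrails. A real harness would replace `fake_llm` with a call to a model provider and would likely add per-tool confirmation prompts, logging, and sandboxing.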

So far, most practical agentic AI workflows (except for coding agents) have not relied on the LLM to this extent to decide what to do next. Instead, they have relied more on developer-specified workflows to deliver higher reliability. But in the past few months, frontier LLMs have advanced sufficiently for this style of harness design to provide an important, if still not entirely reliable, alternative.

CLI (command line interface) coding agents (like Claude Code, Codex CLI, Antigravity CLI, and OpenCode) have been the main type of agent that uses an LLM to drive the next action. But there’s also value to non-CLI agents with easy-to-use interfaces. More precisely, consumers currently interact with AI systems through three key interfaces: (i) chat interfaces (like the web version of ChatGPT), (ii) coding CLI tools, and (iii) desktop agents that can carry out tasks.

I do not use existing commercial desktop agents for highly confidential tasks, since I’m uncomfortable with some of their data-retention policies, which are often buried in obscure legalese and might change overnight with a new model (as we just saw with Anthropic’s Fable release). Also, if you make a small misstep, it may have unexpected legal consequences such as losing legal privilege to confidential documents.

In light of these concerns, my collaborators Rohit Prsad, Devika Verma, and I have been working on a free, open-source alternative: OpenCoworker. This is an open-source project we put together while extending aisuite to support agent harnesses. If you’re interested in learning more about agentic harnesses, you might enjoy checking out the code.

Using OpenCoworker requires your own API key from OpenAI, Anthropic, Google, or another provider, or you can run a local model using Ollama so nothing ever leaves your machine. Some of the data integrations, such as email, are still difficult to set up (comparable in difficulty to what users of other open-source projects such as OpenClaw or Hermes Agent might have experienced). It saves its memory on your computer, and you can choose an LLM provider with a zero data-retention policy, local inference, or other options depending on your privacy requirements.

My teams have been experimenting with OpenCoworker for a wide range of tasks like messaging automation, creating documents, and workflow automation. This is a work in progress, and I hope the open-source community will ensure that there is a viable, open, desktop agent option that is comparable or superior to the closed ones. We are working to make OpenCoworker easier to use, and welcome contributions as well as feedback!

Keep building!

Andrew


A MESSAGE FROM DEEPLEARNING.AI

Learn to run open-source LLMs faster with vLLM. Quantize a model, serve it efficiently, and benchmark performance so you can make informed tradeoffs between speed, cost, and accuracy. Enroll for free

News

Behold Mythos!

After months of headlines that teased a large language model with extraordinary capabilities, Anthropic launched Claude Mythos 5, which can crack software previously believed to be secure, and Claude Fable 5, a version for general use that limits what users can do in an unprecedented way.

What’s new: Claude Mythos 5 and Claude Fable 5 update Claude Mythos Preview, which has received strictly limited distribution since its rollout in early April. The two new models are identical, except Claude Fable 5 doesn’t respond to prompts related to security, biology, chemistry, or distillation and degrades its responses to prompts about building cutting-edge AI. They set new states of the art in a variety of areas including software engineering and knowledge work, and they’re priced at around half the price of Claude Mythos Preview and twice the price of Claude Opus 4.8, Anthropic’s previous flagship model.

How it works: Anthropic disclosed little information about how it built Claude Mythos 5 and Claude Fable 5. Claude Mythos 5 is fine-tuned for alignment but not designed to be “safe for general use.” On the other hand, Claude Fable 5 implements extra layers of precaution, according to a lengthy system card. Anthropic advises that these precautions are not perfect and could hinder performance inappropriately.

Performance: Independent evaluations were not available for Claude Mythos 5 at the time of this publication. Anthropic says its capabilities match those of Claude Fable 5, which Artificial Analysis ranked at the top of its Intelligence Index as well as several of the index’s component evaluations.

Safety concerns: Anthropic rates Claude Mythos 5’s and Claude Fable 5’s propensity to take actions that betray the user’s intentions “very low.” Nonetheless, it expressed concerns over Claude Mythos 5’s potential to behave in undesirable ways or help malevolent users, concerns it has addressed in Claude Fable 5.

Controversy: At its debut, Claude Fable 5 had a further limitation that Anthropic has since modified.

Why it matters: Since April, when Anthropic revealed Claude Mythos Preview, security personnel have been working to prepare their operations for an AI-assisted onslaught, while the public has wondered just what this new class of model can accomplish. Indeed, Claude Mythos 5 and Claude Fable 5 represent a significant improvement, notably in AI-assisted coding. Some observers view Anthropic’s emphasis on Mythos-class safety skeptically, sensing an effort to persuade the market that it has the most powerful technology. Even so, splitting Mythos into a fully capable model with limited distribution and a guardrailed version for general use is reasonable while security teams continue their work.

We’re thinking: These models are impressive! But Anthropic’s decision to degrade Claude Fable 5’s ability to help developers build technology that might compete with Anthropic’s raises concerns, even if users are notified when it happens. Users should be able to use products as they wish for any legitimate purpose. Imagine Microsoft telling developers they couldn’t use Windows to build applications that competed with its own applications, or Google saying you couldn’t use its web search to find information on how to build a company that would compete with it! A fair, level playing field, as well as openness in technology and research, will result in better outcomes in the long term.


Cursor Fits Its Model to Its Agent

Cursor’s latest software engineering model rivals the performance of leading competitors like Claude Opus 4.7 and GPT-5.5 for a fraction of the price.

What's new: Composer 2.5, the native model for the Cursor agentic software-development environment, improves upon Composer 2, released in March. Like its predecessor, Composer 2.5 is based on Moonshot’s open-weights Kimi K2.5.

How it works: Composer 2.5 is built specifically for agentic coding. Cursor detailed its training recipe for Composer 2 in a paper and followed it for Composer 2.5. The authors took Kimi K2.5’s open weights and conducted further pretraining on a large dataset of code. They used reinforcement learning to fine-tune the resulting model using a simulated agentic harness and tools that matched Cursor CLI, the company’s own coding harness. During reinforcement learning, the model was rewarded not only for success but also for brevity and elegance of its output. The team updated the earlier training process as follows:

Performance: Composer 2.5 placed third behind Claude Opus 4.7 and GPT-5.5 on a number of independent coding benchmarks, but pulled ahead on Cursor’s own CursorBench when all tested models used default settings in Cursor CLI. It runs faster and less expensively than comparable models, typically exceeding the cost of only DeepSeek V4 Pro.

Behind the news: In April, SpaceX obtained the right to acquire Cursor for $60 billion or pay $10 billion for their work together, as part of a broader partnership deal. Cursor will train its models using SpaceX hardware, and it is training new models from scratch, so it may not rely on Moonshot’s open-weights offerings for much longer.

Why it matters: It has become common to say that harness engineering – creating the software and tooling that allows models to perform agentic tasks – is becoming as important as the underlying models themselves. By developing Composer as a specialized software engineering model, Cursor rejects that dichotomy. Fine-tuning models within the harness gives users the best of both worlds: The strengths of the model and the surrounding software are built to work together.

We’re thinking: New software-engineering models appear less frequently than they once did, as Anthropic and OpenAI have captured the market with versatile generalist models operating inside their popular Claude Code and Codex coding tools. Indeed, Cursor began as an integrated developer environment (IDE) and sold access to those models long before it developed its own. But Cursor doesn’t need a generalist model; it needs one that can solve developers’ problems at high speed and low cost. Its continued success shows that there’s still a place for specialist models trained for specific tasks, even (or especially) in the agentic era.


RSI Is the New AGI

The phrase _recursive self-improvement_ erupted on social media following an Anthropic report that tracked AI-driven gains in the company’s internal software-engineering productivity.

What’s new: 80 percent of Anthropic’s code is authored by Claude, up from less than 5 percent before the preview release of Claude Code, the company wrote in a blog post, adding that the trend points toward AI systems that “design and refine themselves.” Anthropic’s report thrust the theoretical notion of recursive self-improvement (RSI) into the spotlight, further dividing the AI community between those who call for drastic measures to forestall a dystopian future and those who caution that unrealistic fears will severely undermine the good that AI can do.

Rising productivity: Anthropic measured rising software-development productivity attributable to AI and extrapolated a few scenarios for the future.

Bandwagons and skeptics: Anthropic isn’t the only organization in the AI community thinking about RSI, but responses range from skeptical to bullish.

Behind the news: Recursive self-improvement traces back to early ideas about “intelligence explosions,” most famously articulated by I. J. Good in 1965, who argued that a sufficiently advanced machine intelligence could improve its own design and rapidly surpass human intelligence. In the 2000s and 2010s, Eliezer Yudkowsky at the Machine Intelligence Research Institute in Berkeley, California, formalized RSI as a central concern in AI alignment research. The idea re-entered mainstream AI research with the rise of large language models and AI-assisted coding. A team at Chinese Information Processing Laboratory and elsewhere recently proposed a benchmark, Meta-Agent Challenge, to evaluate AI systems’ capacity for RSI.

Why it matters: The potential for RSI, like the potential for artificial general intelligence (AGI), is distant. A great deal of work and likely a number of breakthroughs stand between present systems, which increasingly multiply human productivity in software development and other fields, and systems that will oversee, design, and engineer their own improvements in a recursive loop that, once it begins, continues _ad infinitum_. Meanwhile, the AI community is divided over exaggerated notions of AI-related danger and here-and-now risks to ongoing innovation and the benefits it can bring. Science-fiction scenarios may be effective if the goal is to scare people or persuade them to give you money, but realistic visions of possible futures are necessary to make real progress.

We’re thinking: In its blog post, Anthropic revived the idea of a global, temporary pause in AI research. Although it doesn’t advocate stopping all research, as the Future of Life Institute did a few years ago, it put this idea back on the table. It’s a bad one, and it empowers doomsayers whose fears are at best unrealistic and at worst self-serving. We wholeheartedly support regulation of dangerous applications, but we should continue to improve fundamental technology as quickly as possible.


Popular large language models have adopted the biases of governments that control the free flow of information, particularly when those models generate output in the languages of countries where such governments are in power, researchers found.

What’s new: Writing produced by organizations that are associated with governments is widespread in datasets that are used to train large language models, and it influences the responses of models built by Anthropic and OpenAI, according to a study by Hannah Waight, Eddie Yang, and colleagues at the University of Oregon, Purdue University, University of California San Diego, New York University, and Princeton University. For instance, China has extensive state-media operations and relatively few independent publishers; and when prompted in Chinese, those LLMs express a more positive attitude toward the Chinese government than they do when prompted in English.

Key insight: Large language models are trained to reproduce an immense amount of material scraped from the web. In countries where media is controlled by the government, a relatively large percentage of material that’s distributed online expresses the government’s point of view without acknowledging other perspectives. Thus, state media has an outsize influence on the output of large language models. Large volumes of state media are not necessary to produce a significant effect. For instance, much of the Chinese-language text on the web is based on official publications, and consequently, Chinese state media exerts a significant influence on Chinese-language LLM output.

How it works: The authors devised a variety of tests to reveal the impact of state media on responses to prompts in various languages. They ranked countries according to state-media dominance based on the World Press Freedom Index and tested prompts in a wide variety of languages, including official national languages and related languages, as well as foreign languages. Much of the study focuses on Chinese and English. They tested models built by Anthropic (including Claude 3 Sonnet) and OpenAI (including GPT-4o).

Behind the news: It is documented that most LLMs are biased towards Western, educated, industrialized, rich, and democratic values. However, those studies were conducted using primarily English-language prompts, which the current study found to be a key variable. A 2025 study also found that LLMs profess different moral attitudes in different languages (in response to statements like “Caring for people who have suffered is an important virtue”).

Why it matters: LLMs are increasingly a go-to source of information for millions of people worldwide. Typically, they don’t cite sources of information they learn from their training data, leaving users in the dark about their influences. Consequently, models may promote agendas that are at odds with the values of users and the societies in which they live and work.

We’re thinking: LLMs are known to be persuasive. This study assumes that state-controlled media wasn’t created deliberately to influence language models – that such influence is a side effect. But it also reveals an obvious incentive for governments and other political actors to influence LLM training data more directly, and by extension, influence national and global politics.
