On Monday, Anthropic accused three Chinese AI labs of using 24,000 fake accounts to run more than 16 million exchanges aimed at stealing Claude’s capabilities. Their own statement tells you everything you need to know about what comes next: “No single company can solve this alone.”
They’re right. And nobody will.
I’ve spent the last several weeks documenting the intelligence revolution as it unfolds — the safety chief who walked out, the $285 billion that vanished, the DeepMind founder who told us 10 years happens every year. But this week’s story isn’t about technology accelerating. It’s about something more fundamental.
It’s about why the most powerful technology ever created will almost certainly emerge into a world with no coordinated governance. Not because people aren’t trying. But because coordination is mathematically impossible under current conditions.
What Just Happened
On February 24, Anthropic published a detailed accusation: three Chinese AI labs — DeepSeek, Moonshot AI, and MiniMax — had conducted “industrial-scale distillation campaigns” against Claude. The numbers are staggering: 24,000 fraudulent accounts. Over 16 million exchanges. Carefully crafted prompts designed to extract Claude’s most valuable capabilities — agentic reasoning, tool use, coding, chain-of-thought reasoning.
DeepSeek’s operation was the most sophisticated. Anthropic says their prompts asked Claude to “imagine and articulate the internal reasoning behind a completed response and write it out step by step” — essentially tricking the model into generating its own training data. They also extracted responses on politically sensitive topics about “dissidents, party leaders, or authoritarianism” — likely to train their own models to steer conversations away from censored subjects.
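To see why that trick matters, it helps to see how mechanically simple distillation is. Below is a minimal sketch; the client, model name, and prompt wrapper are stand-ins of my own invention, not the actual campaign’s code. The entire “attack” is ordinary API usage: query a teacher model, record its answers, and fine-tune a student on them.

```python
# Minimal sketch of distillation via an API. Client, model name, and prompt
# wrapper are hypothetical stand-ins, not the campaign's actual code.
import json

class StubClient:
    """Stand-in for a real API client so the sketch runs end to end."""
    def complete(self, model, prompt):
        # A real client would call the provider's API here.
        return type("Resp", (), {"text": f"[teacher output for: {prompt[:40]}]"})()

def collect_teacher_data(client, prompts, model="teacher-model"):
    """Query a teacher model and keep (prompt, completion) pairs as training data."""
    examples = []
    for prompt in prompts:
        # The campaign's trick: ask the teacher to spell out its reasoning
        # step by step, so the student learns the chain of thought too.
        wrapped = (
            "Imagine and articulate the internal reasoning behind your "
            "answer, step by step, then respond:\n" + prompt
        )
        response = client.complete(model=model, prompt=wrapped)
        examples.append({"prompt": prompt, "completion": response.text})
    return examples

def write_finetuning_file(examples, path="distilled_train.jsonl"):
    """Write examples in the JSONL format most fine-tuning pipelines accept."""
    with open(path, "w") as f:
        for ex in examples:
            f.write(json.dumps(ex) + "\n")

# Fine-tuning a student on distilled_train.jsonl is then ordinary supervised
# training. Collecting the data itself needs no special hardware at all.
write_finetuning_file(collect_teacher_data(StubClient(), ["Explain mergesort."]))
```

Nothing in that loop requires breaking into anything. That is what makes the scale of the campaigns, rather than their sophistication, the real story.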
MiniMax ran the largest campaign — 13 million exchanges. When Anthropic released a new model during the campaign, MiniMax pivoted within 24 hours, redirecting half its traffic to capture capabilities from the latest system. Moonshot AI generated 3.4 million exchanges targeting agentic reasoning, tool use, and computer vision.
This isn’t a one-off. Two weeks earlier, OpenAI sent a memo to Congress making similar accusations against DeepSeek. The same day as Anthropic’s announcement, Google’s Threat Intelligence Group reported distillation attacks on Gemini using over 100,000 prompts. Every major American AI lab is being systematically harvested.
The infrastructure enabling these campaigns is itself a story of coordination failure. The Chinese labs didn’t access Claude directly — Anthropic doesn’t offer commercial access in China. Instead, they used commercial proxy services that resell access to frontier AI models at scale. Anthropic describes these as “hydra cluster architectures” — sprawling networks of fraudulent accounts that distribute traffic across third-party APIs and cloud platforms. In one case, a single proxy network managed more than 20,000 fraudulent accounts simultaneously, mixing distillation traffic with legitimate customer requests to avoid detection.
Think about what that means. There’s now an entire shadow economy built around extracting capabilities from frontier AI models. These proxy services operate across jurisdictions, serve multiple clients, and have no incentive to enforce any nation’s terms of service. They’re the dark pools of the AI race — invisible, cross-border, and effectively ungovernable.
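There is a defensive flip side worth understanding. No single fraudulent account looks abnormal on its own; the signal only appears in aggregate, across accounts. Here is a minimal sketch of that idea. The normalization scheme, threshold, and field names are illustrative assumptions of mine, not Anthropic’s actual fingerprinting system: flag prompt templates shared by implausibly many distinct accounts.

```python
# Illustrative sketch of cross-account cluster detection. The normalization
# scheme, threshold, and field names are assumptions for this example; they
# are not Anthropic's actual fingerprinting system.
import re
from collections import defaultdict

def normalize(prompt: str) -> str:
    """Reduce a prompt to a rough structural template: lowercase it,
    replace numbers, and collapse whitespace."""
    template = prompt.lower()
    template = re.sub(r"\d+", "<num>", template)
    template = re.sub(r"\s+", " ", template).strip()
    return template[:200]  # compare only the structural prefix

def flag_hydra_clusters(requests, min_accounts=50):
    """requests: iterable of (account_id, prompt) pairs.
    Flag prompt templates shared by suspiciously many distinct accounts,
    the aggregate signature of one operator behind thousands of accounts."""
    accounts_per_template = defaultdict(set)
    for account_id, prompt in requests:
        accounts_per_template[normalize(prompt)].add(account_id)
    return {
        template: accounts
        for template, accounts in accounts_per_template.items()
        if len(accounts) >= min_accounts
    }
```

The catch is visible in the code itself: this kind of detection only works for whoever sees all the traffic. Proxy networks that split traffic across providers and mix it with legitimate requests deny any single defender that vantage point.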
But here’s what matters most — not the theft itself, but what it reveals about the coordination problem that will define how superintelligence enters the world.
The Game Theory Nobody Wants to Discuss
I’ve advised organizations across more than 20 countries on AI strategy. In every conversation — with government officials, corporate leaders, military planners — I eventually arrive at the same uncomfortable truth: everyone agrees global AI coordination is important. Everyone agrees on almost nothing else.
The distillation story is a textbook Prisoner’s Dilemma playing out in real time.
Consider the two-player version between the US and China. If both develop AI safely and slowly, both arrive at powerful systems together — the best collective outcome. If one develops fast while the other is cautious, the fast mover gets a decisive advantage. If both race, both arrive at powerful systems without adequate safety — the worst collective outcome.
The Nash equilibrium, the outcome from which neither player can gain by unilaterally changing strategy, is for both to race. Even though mutual caution would be better for everyone.
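The claim is easy to verify mechanically. Here is the game as a worked example; the payoff numbers are illustrative orderings, not measurements, and encode exactly the outcomes described above. Whatever the other side does, racing pays more, so mutual racing is the equilibrium even though mutual caution beats it for both players.

```python
# Two-player race dynamics as a standard Prisoner's Dilemma. Payoff numbers
# are illustrative orderings only; higher is better for that player.
PAYOFFS = {
    # (player_0_strategy, player_1_strategy): (player_0_payoff, player_1_payoff)
    ("cautious", "cautious"): (3, 3),  # both safe: best collective outcome
    ("cautious", "race"):     (0, 4),  # the racer gains a decisive advantage
    ("race",     "cautious"): (4, 0),
    ("race",     "race"):     (1, 1),  # both race: worst collective outcome
}

def best_response(options, their_move, me):
    """Pick the strategy that maximizes my payoff, given the other's move."""
    def my_payoff(my_move):
        moves = (my_move, their_move) if me == 0 else (their_move, my_move)
        return PAYOFFS[moves][me]
    return max(options, key=my_payoff)

options = ["cautious", "race"]
for their_move in options:
    # Whatever the other player does, racing is the best response...
    assert best_response(options, their_move, me=0) == "race"
    assert best_response(options, their_move, me=1) == "race"
# ...so (race, race) is the Nash equilibrium, despite (cautious, cautious)
# giving both players a strictly higher payoff.
```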
DeepSeek’s distillation campaign is what “racing” looks like in practice. They didn’t wait for a coordination framework. They didn’t respect terms of service or regional access restrictions. They built 24,000 fake accounts and extracted capabilities as fast as they could. Because in an uncoordinated world, the rational move is to take whatever advantage you can get.
And the speed is telling. When Anthropic released a new model during an active campaign, MiniMax redirected half its traffic within 24 hours to capture the latest capabilities. That’s not rogue actors freelancing. That’s systematic, adaptive capability extraction operating at a pace no governance framework could match.
Now add more players. It’s not just the US and China. It’s also the EU with its AI Act, the UK with its safety institute, Russia with its military AI programs, dozens of private companies with no national loyalty, and open-source communities that make capabilities freely available to anyone. The multi-player version of this game is exponentially harder to solve. With two players, you need one agreement. With six major players, you need fifteen bilateral agreements — or one multilateral framework that all six accept. History gives us almost no examples of that working on technology with military applications.
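That fifteen is not rhetoric; it is combinatorics. Bilateral agreements among n players scale as n(n-1)/2, so the negotiation burden grows quadratically with every player added. A quick check:

```python
# Number of bilateral agreements needed among n players: n choose 2.
from math import comb

for n in (2, 6, 10, 20):
    print(f"{n} players -> {comb(n, 2)} bilateral agreements")

# 2 players -> 1
# 6 players -> 15
# 10 players -> 45
# 20 players -> 190
```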
Anthropic’s response is revealing. They’ve implemented detection systems, behavioral fingerprinting, enhanced verification. They’re sharing threat intelligence with other AI labs. But their own conclusion is devastating: “Distillation attacks at this scale require a coordinated response across the AI industry, cloud providers, and policymakers.”
They’re calling for coordination from inside a system structurally incapable of producing it.
Why History Offers No Comfort
Before anyone points to nuclear non-proliferation or chemical weapons conventions as models, let me explain why AI coordination is harder than any previous technology governance challenge.
Nuclear weapons required nation-state resources — uranium enrichment, plutonium production, massive industrial infrastructure. The barriers to entry were enormous, which made coordination among a small number of players at least conceivable. AI requires a few thousand GPUs and clever algorithms. The barriers to entry are low and falling. DeepSeek demonstrated this when they released R1 last year — approaching frontier performance at dramatically lower cost. Researchers at UC Berkeley recreated a comparable reasoning model for $450 in 19 hours. Stanford and University of Washington researchers did it in 26 minutes for under $50.
Climate change coordination has been attempted for decades with limited success, despite existential stakes. But climate change operates on decades-long timescales that at least theoretically allow for iterative governance. AI operates on timescales of weeks and months. The distillation campaigns adapted faster than any governance mechanism could respond.
Biological weapons conventions exist but have limited enforcement, precisely the same weakness any AI governance framework would face. And gain-of-function research, which poses dual-use risks similar to AI’s, has no effective global coordination despite years of effort.
The pattern is clear: humans are bad at coordinating on long-term existential threats, especially when short-term advantages are on the table. AI adds a dimension previous technologies lacked — it evolves faster than our institutional capacity to govern it.
The AI Governance Trilemma
In economics, there’s a concept called the “impossible trinity” — you can have free capital flows, fixed exchange rates, or independent monetary policy, but you can’t have all three simultaneously. I’ve identified an equivalent in AI governance.
Nations can have two of these three things, but never all three:
Strong AI safety protections. Rapid AI innovation and deployment. Global competitiveness.
The US wants all three. So does China. So does the EU. And the distillation story shows exactly why you can’t have them.
Anthropic builds safety guardrails into Claude — protections against bioweapons synthesis, malicious code generation, disinformation. These represent the “strong safety” corner of the trilemma. But those guardrails slow development and add cost — tension with “rapid innovation.” And when Chinese labs distill Claude’s capabilities into their own models, the safety guardrails get stripped out entirely. The distilled models retain the capabilities but not the protections.
This is the trilemma made concrete. Anthropic invests in safety, competitors extract the capability without the safety overhead, and the competitive landscape punishes the company that tried to be responsible.
The EU’s approach reveals the same tension from a different angle. The AI Act imposes comprehensive safety requirements — good for protection, but European AI companies consistently cite regulatory burden as a competitive disadvantage. Meanwhile, China’s approach prioritizes competitiveness and speed, with safety defined primarily as political alignment rather than technical safeguards.
Every nation faces this impossible choice. And because no nation can achieve all three simultaneously, the result is a race to the bottom where the most permissive jurisdiction wins. The distillation campaigns are the mechanism by which that race operates.
The Five Conditions for Coordination (We Have Zero)
In my work across jurisdictions, I’ve identified five conditions that would need to be met for effective global AI coordination. As of today, we meet none of them.
Condition 1: Overcome competitive pressures. Nations and companies would need to accept slower development in exchange for collective safety. The distillation story shows the opposite — competitive pressure is intensifying, not easing. DeepSeek’s upcoming V4 model reportedly outperforms both Claude and ChatGPT in coding. The distillation may already have worked.
Condition 2: Values alignment. The US, EU, and China would need to agree on what “safe AI” means. But Anthropic’s own analysis shows that DeepSeek was extracting capabilities specifically to handle politically sensitive queries differently — to steer conversations away from topics China censors. Safety means fundamentally different things in different political systems.
Condition 3: Governance that moves faster than democratic deliberation allows. AI capabilities advance on timescales of weeks and months. Democratic governance operates on timescales of years. The distillation campaigns adapted within 24 hours when new models were released. No governance framework on earth moves that fast.
Condition 4: Enforcement that overrides sovereignty. Even if nations agreed on rules, who enforces them? Anthropic can detect fraudulent accounts. But the proxy networks that enabled the distillation — sprawling “hydra cluster” architectures controlling 20,000+ accounts, mixing extraction traffic with legitimate requests — operate across jurisdictions where no single authority has enforcement power.
Condition 5: Agreement before understanding. We would need to agree on governance frameworks before we fully understand what we’re governing. But the technology evolves faster than our understanding of it. By the time a governance framework is negotiated, the capabilities it was designed to address have been superseded.
Zero out of five. And there’s no credible pathway to achieving even three of the five within the relevant timeline.
The Entente That Can’t Hold
There’s a deeper irony in the Anthropic story that illuminates another dimension of the coordination problem: even internal coordination within the US is fracturing.
Anthropic CEO Dario Amodei has advocated for an “entente” strategy — a coalition of democratic nations using AI to maintain decisive advantage over authoritarian competitors. He’s called for strong export controls on AI chips to China. He’s argued that DeepSeek scored “the worst” on bioweapons safety tests. He’s positioned Anthropic as the safety-first lab that also serves national security.
But the contradictions are multiplying. Anthropic holds a $200 million pilot contract with the US military. Claude is reportedly the only AI model deployed within the military’s classified systems. And Defense Secretary Pete Hegseth has summoned Amodei to the Pentagon — reportedly “not a friendly meeting” — over Anthropic’s safety restrictions on military use of Claude. The Pentagon wants fewer guardrails, not more.
So Anthropic is simultaneously arguing that Chinese labs are dangerous because they strip safety guardrails from distilled models, while the US military is pressuring Anthropic to strip safety guardrails from its own military deployment. The “entente” strategy requires allies to coordinate on safety standards. But even within the lead nation of the proposed entente, the government and the leading safety lab can’t agree on what safety means.
Meanwhile, a researcher named Yao Shunyu left Anthropic specifically because of Amodei’s anti-China stance, moving to Google DeepMind — which advocates for more cooperation with China, not less. Even within the AI research community, there’s no consensus on whether coordination or competition is the right approach.
If you can’t coordinate within a single country, between a single company and its own government, how do you coordinate globally?
The Export Control Illusion
There’s a crucial policy dimension to the distillation story that most coverage has missed.
For the past two years, US AI policy has centered on export controls — restricting China’s access to advanced AI chips like NVIDIA’s H100 and H200. The logic: if you can’t access cutting-edge compute, you can’t train frontier models. Last month, the Trump administration loosened these restrictions, allowing export of H200 chips to China. Critics called it reckless. Supporters argued that chip restrictions weren’t working anyway because Chinese labs were making rapid progress regardless.
The distillation story reveals why both sides are partially right, and why the entire framing is inadequate.
Anthropic’s blog post makes the connection explicit: “Distillation attacks require access to advanced chips. Distillation therefore reinforces the rationale for export controls: restricted chip access limits both direct model training and the scale of illicit distillation.”
But here’s the catch: distillation targets a completely different layer of competitive advantage than chips do. Export controls restrict hardware. Distillation extracts software capabilities — the reinforcement learning, the reasoning chains, the safety-trained behaviors — through nothing more than API access. You don’t need an H100 to run 16 million queries through Claude. You need a credit card and a proxy network.
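To make “a credit card and a proxy network” concrete, here is a back-of-envelope estimate. Every number in it is an assumption for illustration (real API prices vary by model and change often); what matters is the order of magnitude.

```python
# Back-of-envelope cost of a 16M-exchange distillation campaign. All prices
# and token counts below are illustrative assumptions, not quoted rates.
exchanges = 16_000_000
tokens_in_per_exchange = 500     # assumed prompt length
tokens_out_per_exchange = 1_500  # assumed response length; reasoning chains run long
price_in_per_mtok = 3.00         # assumed $ per 1M input tokens
price_out_per_mtok = 15.00       # assumed $ per 1M output tokens

cost = exchanges * (
    tokens_in_per_exchange / 1e6 * price_in_per_mtok
    + tokens_out_per_exchange / 1e6 * price_out_per_mtok
)
print(f"Assumed campaign cost: ~${cost:,.0f}")  # ~$384,000 under these assumptions
```

Under those assumed rates, a 16-million-exchange campaign costs a few hundred thousand dollars. The restricted hardware that export controls police costs orders of magnitude more.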
As one security analyst put it: compute is one piece of staying ahead in the AI race, but reinforcement learning is increasingly the critical piece, and distillation lets you extract those capabilities regardless of what hardware you own.
This means the entire US policy framework for AI competition — centered on chip exports — is fighting the last war. The real capability transfer is happening through API access, and no export control regime covers it. The distilled models may not be as good as the originals, but they’re close enough — and improving with each campaign.
DeepSeek’s upcoming V4 model reportedly outperforms both Claude and ChatGPT in coding. If distillation contributed to that performance — and Anthropic clearly believes it did — then the policy response has been targeting the wrong vector entirely. We’ve been locking the front door while the back door stands wide open.
This is coordination failure at the policy level, compounding the coordination failure at the geopolitical level.
Four Futures, with Honest Probabilities
Based on my analysis of governance dynamics across multiple jurisdictions, I see four possible futures for global AI coordination. I’m going to assign probabilities that I know will be uncomfortable, because intellectual honesty demands it.
50%: Fragmented governance. This is the current trajectory. Every nation develops its own approach. The US prioritizes innovation and military advantage. China prioritizes political control and competitive parity. The EU prioritizes rights and regulation. No global framework emerges. Superintelligence develops within competing national and corporate ecosystems with incompatible safety standards. The distillation campaigns continue and intensify.
30%: Hegemonic control. One nation — most likely the US or China — achieves decisive AI advantage and imposes governance on others. This could mean the US entente Amodei advocates, or it could mean Chinese AI dominance. Either way, governance reflects the values and interests of the winner, not a global consensus.
15%: Coalition governance. A coalition of democracies manages to coordinate — not perfectly, but well enough to establish meaningful standards. This requires unprecedented cooperation and would likely exclude China, creating a bifurcated AI ecosystem. Possible but historically unprecedented at the speed required.
5%: Global coordination. All major nations agree on meaningful AI governance frameworks with real enforcement mechanisms. The only truly safe outcome. And the least likely, for all the reasons the distillation story illustrates.
The most probable future — fragmented governance — is also the most dangerous for superintelligence. It means the most powerful technology ever created emerges into a world of competing standards, stolen capabilities, stripped safety guardrails, and no coordination mechanism.
The Timeline Collision
Here’s what makes all of this urgent rather than academic.
The UN General Assembly established two AI governance mechanisms in August 2025: an Independent International Scientific Panel on AI (40 experts) and a Global Dialogue on AI Governance. The first Global Dialogue is scheduled for July 2026. The second is planned for 2027.
Superintelligence, by the estimates of the people building it, arrives in 2027-2028. Hassabis says 10 years happens every year. Amodei warns of “unusually painful” disruption within five years. The infrastructure is being built now.
We’ll be having our second international conversation about AI governance at approximately the same time superintelligence emerges.
The distillation campaigns reveal a world that can’t coordinate on something as basic as “don’t steal each other’s model outputs through fake accounts.” And we’re expecting this same world to coordinate on the governance of superintelligent systems?
The gap between the speed of AI development and the speed of international governance isn’t narrowing. The distillation story shows it widening — AI labs adapting within 24 hours, governance mechanisms operating on multi-year timescales.
What This Means — and What It Doesn’t
I want to be clear about what I’m not saying. I’m not saying coordination is unimportant. I’m not saying we should stop trying. And I’m not saying any particular nation is the villain.
The distillation campaigns are a symptom, not the disease. The disease is a global system that incentivizes competition over coordination, speed over safety, and national advantage over collective survival. Every player in this system — the US, China, the EU, every AI company — is responding rationally to the incentives they face. That’s what makes the problem so intractable. You can’t solve a coordination failure by asking individuals to act against their rational self-interest. You solve it by changing the incentive structure. And nobody has the authority to change global incentive structures.
What I am saying is this: we should be honest about the probability that effective global coordination will emerge in time. My assessment, based on consulting work across multiple jurisdictions and analysis of governance dynamics: approximately 5%.
That doesn’t mean despair. It means building adaptive capacity for a world where superintelligence arrives without coordinated governance. It means strengthening national and regional safety frameworks even if global ones fail. It means investing in AI safety research as if coordination won’t save us — because it probably won’t. It means companies and nations building the most robust safety infrastructure they can, independent of whether others reciprocate.
And it means asking harder questions. Not “how do we coordinate?” but “what happens when we don’t?” Not “how do we prevent the race?” but “how do we survive it?” Not “how do we stop distillation?” but “what does a world of distilled, ungoverned superintelligent systems actually look like — and how do we prepare for it?”
Anthropic’s distillation disclosure ends with a call for coordinated response. It’s the right call. But their own story proves why it probably won’t happen.
24,000 fake accounts. 16 million stolen exchanges. Three labs. Zero coordination mechanisms.
That’s not a cybersecurity story. That’s a preview of how superintelligence enters the world.
And the window Anthropic describes — the one that’s “narrow” and requires “rapid, coordinated action” — is closing faster than any institution on earth is capable of moving through it.
In your experience — across your industry, your country, your organization — have you seen any evidence that meaningful AI coordination is possible? Or are we already past the point of no return?