AI Press Review
May 08, 2026 · Episode 6 · 23:18

Anthropic's 80x Compute Surge, OpenAI's MRC Protocol, and DeepSeek's $45B Valuation

Anthropic seized the full capacity of SpaceX's Colossus-1 data center — over 220,000 Nvidia GPUs and 300 megawatts — after its compute demand grew 80x, while simultaneously unveiling financial services AI agents at a JPMorgan-backed event where CEO Jamie Dimon endorsed the trillion-dollar AI capex wave. Mozilla deployed Anthropic's most capable unreleased model to find and fix hundreds of Firefox security vulnerabilities, the most concrete demonstration this week of frontier AI delivering measurable engineering output at scale. Next week, watch DeepSeek's reported $45 billion first funding round, led by China's state-backed semiconductor fund, for a signal on how Beijing is consolidating its AI national champions ahead of any US-China AI talks.


Your Weekly AI Press Review — Week of May 08, 2026: Compute War.

This Friday, the biggest story is Anthropic's emergency seizure of Elon Musk's Colossus-1 supercomputer — 220,000 GPUs, 300 megawatts — after 80x demand growth blew past its own infrastructure. We've got that, OpenAI's new networking protocol, DeepSeek's $45 billion valuation bid, and OpenAI's real-time voice models. Plus, off the radar: a Chinese startup rebuilding the entire server architecture around GPUs instead of CPUs, and Korea's government data-recycling play.

Anthropic is taking over the entire computing capacity of SpaceX's Colossus-1 data center. That's more than 220,000 Nvidia GPUs and over 300 megawatts of power. The deal is expected to come online within a month. Behind the move: Anthropic's compute demand grew roughly 80x, blowing past its own contracted infrastructure. The company is also doubling rate limits for Claude Code and significantly raising API limits for its most capable models. The Musk angle is striking — Anthropic and xAI have been fierce rivals, but compute scarcity overrides ideology.

DeepSeek is reportedly seeking its first external funding round at a valuation of about $45 billion. China's state-backed semiconductor vehicle — known as the Big Fund — is in talks to lead the round. Tencent is also among investors still in discussions. The final lineup has not been decided. If it closes at that figure, DeepSeek would rank among the most valuable private AI companies on Earth. The Big Fund's involvement signals Beijing is treating DeepSeek as strategic national infrastructure, not merely a commercial startup.

OpenAI shipped three new voice models to its API this Friday. The flagship, GPT-Realtime-2, brings reasoning that OpenAI says matches its most capable text model. A second model, GPT-Realtime-Translate, handles real-time translation across more than 70 languages. A third, GPT-Realtime-Whisper, handles live transcription. The release targets enterprise voice applications — call centers, real-time interpretation, and voice-driven agents — where latency and reasoning quality have historically forced a trade-off. OpenAI is now claiming both.

OpenAI, AMD, Broadcom, Intel, Microsoft, and Nvidia jointly released MRC — Multipath Reliable Connection — an open-source networking protocol for large-scale AI supercomputers. MRC spreads data packets across hundreds of simultaneous paths between GPUs. It recovers from network failures in microseconds. Critically, it allows clusters of more than 100,000 GPUs to be built using only two tiers of Ethernet switches, versus the three or four tiers previously required. The protocol is already running on OpenAI's Stargate supercomputer. The open-source release is a direct challenge to proprietary networking vendors.
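The MRC specification itself is not reproduced in this brief, but the core idea of multipath packet spraying can be sketched. The class below is an illustrative toy, not MRC's actual design: every name and structure here is an assumption. It shows the key property the protocol relies on, that traffic is spread over all healthy paths, so losing one path means rerouting rather than resetting the connection.

```python
class MultipathSender:
    """Toy sketch of multipath packet spraying (NOT the MRC spec):
    packets are sprayed round-robin across all currently healthy paths,
    so a single path failure only shrinks the spray set."""

    def __init__(self, num_paths):
        self.healthy = set(range(num_paths))  # path IDs currently usable

    def mark_failed(self, path_id):
        # Real NICs detect failure at hardware timescales (microseconds);
        # here we simply remove the path from the healthy set.
        self.healthy.discard(path_id)

    def send(self, packets):
        """Assign each packet a path; returns (packet, path_id) pairs."""
        if not self.healthy:
            raise RuntimeError("no healthy paths")
        paths = sorted(self.healthy)
        return [(pkt, paths[i % len(paths)]) for i, pkt in enumerate(packets)]

sender = MultipathSender(num_paths=4)
assignments = sender.send(range(8))   # traffic spread over all 4 paths
sender.mark_failed(2)                 # one path dies mid-stream
reassigned = sender.send(range(8))    # traffic continues on the other 3
```

The design point is that no per-path connection state has to be torn down on failure, which is what makes microsecond-scale recovery plausible in the real protocol.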

Roche agreed to pay up to $1 billion to acquire PathAI, a Boston-based startup specializing in AI-powered pathology diagnostics. The deal gives Roche an AI layer for tissue analysis across oncology workflows. PathAI's models are trained on digitized pathology slides and have been validated in clinical settings. For Roche, the acquisition accelerates its push to embed AI into diagnostic hardware and lab services — a market where speed and accuracy of cancer detection directly affect treatment outcomes and reimbursement rates.

Moonshot AI, the Beijing-based developer of the Kimi assistant, confirmed a roughly $2 billion funding round that values the company at more than $20 billion. The round was led by Long-Z Investments, Meituan's venture arm, with China Mobile also participating. Over the past six months, Moonshot has raised a total of about $3.9 billion. The company is simultaneously navigating Beijing's new listing rules for overseas-registered firms, which complicates a potential IPO. Kimi competes directly with Baidu's Ernie and Alibaba's Qwen in China's fast-consolidating consumer AI market.

Anthropic co-founder Jack Clark published a prediction this Friday: there is a greater than 60% chance that an AI model will fully train its successor by the end of 2028. Clark made the statement alongside the launch of the Anthropic Institute, a new research body focused on recursive self-improvement risks. Anthropic says it is already seeing early signs of AI contributing to its own research and development cycle. The institute's agenda was shared exclusively with Axios before publication. For enterprise risk officers, this is the clearest public signal yet from a frontier lab that the pace of capability gain is accelerating beyond a linear rate.

DeepL, the German AI translation company, is cutting roughly 250 employees — about one-fifth of its workforce — to rebuild as what it calls an AI-native organization. DeepL competes with Google Translate and Microsoft Translator. The layoffs follow a pattern seen across AI-adjacent software companies: the underlying model capabilities have advanced faster than the human workflows built around them. DeepL raised about $300 million at a $2 billion valuation in 2023. No updated valuation was disclosed alongside the restructuring announcement.

Rackspace Technology soared after announcing a partnership with AMD focused on AI infrastructure services. The deal positions Rackspace to offer AMD Instinct GPU-based cloud capacity to enterprise customers. Rackspace has been repositioning away from legacy managed hosting toward AI-specific infrastructure. The AMD partnership gives it a differentiated supply of compute at a moment when Nvidia GPU availability remains constrained for many mid-market buyers. No financial terms were disclosed, but the stock reaction reflected investor appetite for any credible AI infrastructure play outside the hyperscaler tier.

The US and China are exploring official government-to-government talks on artificial intelligence, according to the Wall Street Journal. No formal framework has been agreed. The discussions are preliminary. The backdrop is significant: both countries have been building parallel AI ecosystems with minimal technical coordination, and the risk of miscalculation — particularly around military AI applications — has been rising. Any formal channel would be the first structured bilateral AI dialogue since the Biden-era safety commitments, which carried no enforcement mechanism.

WiseTech Global, the Australian logistics software company, is still processing roughly 2,000 planned job cuts announced in February. This Friday, workers told The Guardian they have been waiting nearly three months to learn which roles will be eliminated. WiseTech's founder told investors this week that an AI agent can learn a human's job in about 15 minutes. The company's AI-first pivot is one of the most aggressive in enterprise software outside the US. The prolonged uncertainty is generating significant internal friction and reputational risk with remaining staff.

Mozilla used Anthropic's Claude to harden Firefox against security vulnerabilities. Mozilla had access to a preview of Anthropic's most capable unreleased model. The result: hundreds of real vulnerabilities found and fixed. Mozilla's security team described a dramatic shift in AI-generated bug report quality over just a few months — from low-signal noise to high-precision, actionable findings. The team also improved its own techniques for steering and stacking AI models to filter false positives. This is one of the most concrete published accounts of frontier AI delivering measurable security engineering output.

Anthropic unveiled a Financial Services Solution at a New York event co-hosted with JPMorgan Chase CEO Jamie Dimon. The product bundles Claude-based AI agents tailored to financial workflows — research synthesis, regulatory document review, and client communication drafting. Dimon endorsed the broader AI capex wave, calling the technology worth a trillion-dollar investment. The event signals Anthropic's push into regulated enterprise verticals where data sensitivity and compliance requirements have slowed AI adoption. JPMorgan's public endorsement carries weight with other large financial institutions evaluating vendor selection.

ServiceNow, SAP, and Workday are each moving toward consumption-based pricing for AI agent usage, according to reporting published this Friday. The shift means enterprise customers will pay per agent action or per workflow completion, rather than per seat. This repricing model has significant implications for IT budget planning: costs become variable and tied directly to AI utilization rates. For CFOs, it introduces a new category of operational expenditure that scales with automation depth — and that is difficult to forecast without usage data from early deployments.
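The budgeting problem described above comes down to simple arithmetic: per-seat contracts are fixed, while per-action pricing scales with utilization. The sketch below uses entirely hypothetical prices and volumes, purely to show how the same deployment can land far above or below a seat-based budget depending on usage.

```python
def per_seat_cost(seats, price_per_seat):
    """Fixed annual cost: predictable regardless of how much the AI is used."""
    return seats * price_per_seat

def per_action_cost(actions, price_per_action):
    """Variable cost: scales directly with AI agent utilization."""
    return actions * price_per_action

# Hypothetical figures for illustration only.
seat_total = per_seat_cost(seats=500, price_per_seat=600)               # 300,000
low_usage  = per_action_cost(actions=200_000, price_per_action=0.50)    # 100,000
high_usage = per_action_cost(actions=1_200_000, price_per_action=0.50)  # 600,000
```

With these made-up numbers, the same contract costs a third of the seat-based budget at low utilization and double it at high utilization — which is exactly why early-deployment usage data becomes a prerequisite for forecasting.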

CISA, the US Cybersecurity and Infrastructure Security Agency, published a guidance document this Friday on what it calls careful agentic AI adoption. The document targets federal agencies and critical infrastructure operators. It flags risks specific to autonomous AI agents: unintended action chains, insufficient human oversight, and prompt injection attacks that redirect agent behavior. CISA recommends staged deployment, mandatory human-in-the-loop checkpoints for high-consequence actions, and red-teaming before production rollout. It is the first formal US government guidance document specifically addressing agentic AI risk.

Haun Ventures closed $1 billion across two new funds. One targets stablecoin infrastructure. The second targets what the firm calls AI agent plumbing — the middleware, identity, and payment rails that autonomous AI agents will need to transact on behalf of users. The dual focus reflects a thesis that AI agents and crypto infrastructure are converging: agents need programmable money to act autonomously in economic contexts. Haun's portfolio already includes several companies at this intersection. The fund size signals institutional LP appetite for the AI-crypto convergence thesis.

Zyphra released ZAYA1-8B, a reasoning Mixture of Experts model with only about 760 million active parameters. Despite its small active footprint, the model outperforms several open-weight models many times its size on math and coding benchmarks. It was trained end-to-end on AMD Instinct MI300 hardware and released under the Apache 2.0 license. Zyphra claims it surpasses Claude's mid-tier model on the HMMT 2025 math competition benchmark using a novel test-time compute method called Markovian RSA. The result adds to a growing body of evidence that intelligence density — not raw parameter count — is the key efficiency frontier.
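The gap between ZAYA1's 8 billion total parameters and roughly 760 million active parameters follows from how top-k Mixture of Experts routing works: only a few experts run per token. The accounting sketch below uses hypothetical shapes, not Zyphra's actual configuration, just to show how the total-to-active ratio arises.

```python
def moe_param_counts(d_model, d_ff, n_experts, top_k, n_layers):
    """Rough parameter accounting for the feed-forward blocks of a
    top-k MoE. Each expert is a two-matrix FFN; only top_k experts
    execute per token, so 'active' parameters are a thin slice of
    the total. (Ignores attention and embedding parameters.)"""
    expert_params = 2 * d_model * d_ff          # up- and down-projection
    total = n_layers * n_experts * expert_params
    active = n_layers * top_k * expert_params
    return total, active

# Hypothetical shapes, chosen only to illustrate the total-vs-active gap.
total, active = moe_param_counts(d_model=2048, d_ff=8192,
                                 n_experts=64, top_k=2, n_layers=24)
ratio = total / active  # 64 experts routed 2-at-a-time -> 32x
```

This is the mechanism behind the "intelligence density" framing: per-token compute is priced by active parameters, while capacity is stored in the full expert pool.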

Aon published a report this Friday finding that enterprise AI plans are outpacing workforce investment. The firm surveyed large employers across financial services, healthcare, and manufacturing. The gap is structural: companies are committing capital to AI tooling and infrastructure faster than they are training employees to use it effectively. Aon flagged this as a material risk to AI ROI — not a technology failure, but an adoption failure driven by under-investment in change management and skills development. The finding aligns with separate Singapore data showing 95% of SME decision-makers say they need more AI training.

On deployments. Mozilla used Anthropic's most capable unreleased model to locate and fix hundreds of security vulnerabilities in Firefox. The project ran over several months. Mozilla's team described the shift from AI-generated noise to high-precision bug reports as dramatic — occurring over just a few months of model improvement. The team developed techniques to stack and steer multiple AI models, filtering false positives before human review. The result is a concrete, published benchmark for what frontier AI can deliver in a security engineering context at production scale.

Parloa, a German enterprise voice AI company, deployed OpenAI models to power real-time voice agents for large-scale customer service operations. The platform lets enterprises design, simulate, and deploy voice interactions without building custom speech pipelines. Parloa's agents handle live conversations, route complex queries, and escalate to human agents based on confidence thresholds. OpenAI featured the deployment on its website this Friday as a reference case for its Realtime API. The timing aligns with OpenAI's release of three new voice models, suggesting Parloa is among the first enterprise partners to access the upgraded stack.
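Confidence-threshold escalation of the kind described here is a simple routing rule. The sketch below is a generic illustration, not Parloa's actual logic; the function name, fields, and threshold are all assumptions.

```python
def route_turn(intent, confidence, threshold=0.8):
    """Illustrative escalation rule (not Parloa's implementation):
    the AI agent handles the turn only when model confidence clears
    the threshold; otherwise the conversation escalates to a human,
    carrying the detected intent as context."""
    if confidence >= threshold:
        return {"handler": "ai_agent", "intent": intent}
    return {"handler": "human", "intent": intent, "reason": "low_confidence"}

high = route_turn("refund_status", confidence=0.93)    # stays with the AI
low  = route_turn("legal_complaint", confidence=0.41)  # escalates to a human
```

In production the threshold is typically tuned per intent class, since the cost of a wrong automated answer varies widely between, say, order tracking and a legal complaint.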

Tomofun, the Taiwan-based pet-tech company behind the Furbo Pet Camera, deployed vision-language models on AWS Inferentia2 instances for real-time pet behavior detection. The move to Inferentia2 — Amazon's purpose-built AI chip — reduced inference costs compared to GPU-based alternatives while maintaining detection accuracy. The deployment runs continuously on live video streams from Furbo cameras in homes across multiple countries. AWS published the case study this Friday as a reference architecture for cost-sensitive, always-on vision AI workloads outside the data center.

Hyundai Card, the South Korean financial services company, began testing generative AI for PR writing and is building an internal LLM trained on proprietary company data. The internal model is designed to handle brand-specific language, regulatory tone requirements, and Korean-language nuance that general-purpose models handle inconsistently. Hyundai Card joins a growing list of Asian financial institutions — including several Japanese megabanks — building internal LLMs rather than relying solely on external API providers. The driver is data sovereignty and the ability to fine-tune on proprietary customer and product data.

Amazon Web Services published a reference architecture this Friday for deploying vision-language models on Inferentia2 for manufacturing quality control. The architecture uses 3D point cloud anomaly detection — identifying defects in physical objects from depth-sensor data. A separate research paper published the same day showed a consistency-model approach achieving up to 80 times faster inference than prior state-of-the-art methods for the same task, without GPU acceleration. Together, the two publications signal that 3D anomaly detection is moving from research prototype to production-viable deployment on cost-constrained edge hardware.

Anthropic's Financial Services Solution, unveiled at the JPMorgan event this Friday, bundles ten Claude-based AI agents targeting specific finance workflows: regulatory document review, earnings call summarization, client communication drafting, and compliance flagging. The product is designed for deployment inside existing financial institution infrastructure, with data isolation controls intended to meet SEC and FINRA requirements. Anthropic did not disclose pricing or named enterprise customers beyond JPMorgan's public endorsement. The solution competes directly with Bloomberg's AI layer and Microsoft's Copilot for Finance.

Yellow.ai launched Nexus Vox, an enterprise voice AI platform that clones a brand's voice and deploys it across more than 500 languages with sub-second latency. The system is designed for contact center replacement at scale. Yellow.ai claims Nexus Vox can handle full customer service conversations — not just routing — in real time, with brand-consistent tone across languages. The sub-second latency claim is the key differentiator: prior multilingual voice AI systems introduced perceptible delays that degraded customer experience scores. No independent latency benchmarks were published alongside the launch.

Off the radar. A Beijing-based startup called Ronxin Zhiyuan — founded by a Tsinghua University electrical engineering graduate — closed an angel round of several hundred million yuan this Friday, led by Beijing's Green Energy and Low-Carbon Industry Fund and Saif Investment Fund. The company is building what it calls an AGC architecture: AI Computer Systems with the GPU as Core. The design inverts the traditional server stack, making the GPU the primary compute unit and demoting the CPU to a peripheral controller. The GPU-to-CPU ratio shifts from roughly 2-to-1 in conventional servers to as high as 32-to-1. A single operating system manages up to 64 GPUs in a unified address space, eliminating cross-node data copying. The company's custom AI BMC system cuts hardware monitoring response time from the standard 3-to-5 second polling cycle down to microsecond-level reaction — enabling real-time thermal management and fault recovery. When a single GPU fails, the system reroutes workloads to redundant GPUs without stopping the job. This is a full-stack architectural bet, not a point optimization, and it is happening entirely outside the Nvidia ecosystem.

South Korea's government announced a plan to upcycle existing AI training datasets for the generative AI era, according to the Seoul Economic Daily. The initiative, coordinated through the Ministry of Science and ICT, involves re-annotating and restructuring datasets originally built for narrow supervised learning tasks — image classification, speech recognition, named entity recognition — into formats suitable for instruction-tuning and RLHF pipelines. Korea has invested heavily in public AI datasets over the past decade. Rather than building from scratch, the government is treating that corpus as a reusable national asset. The move is cost-efficient and strategically significant: it gives Korean AI labs a head start on Korean-language fine-tuning data that foreign labs cannot easily replicate.
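Mechanically, this kind of upcycling reshapes a narrow supervised-learning record (text plus gold label) into an instruction-tuning example. The sketch below is schematic and assumes nothing about Korea's actual pipeline; real re-annotation work adds template variety, quality filtering, and human review on top of this reshaping step.

```python
def upcycle_classification(record, template):
    """Reshape a legacy classification record into an instruction-tuning
    example: the task description becomes the instruction, the original
    text becomes the input, and the gold label becomes the target output."""
    return {
        "instruction": template,
        "input": record["text"],
        "output": record["label"],
    }

# A legacy sentiment-classification record ("Delivery was really fast").
legacy = {"text": "배송이 정말 빨랐어요", "label": "positive"}
example = upcycle_classification(
    legacy, template="Classify the sentiment of the following sentence.")
```

The strategic point is that the labels were already paid for; only the framing changes, which is why the approach is cheap relative to collecting new instruction data from scratch.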

China's AI video generation market is producing revenue numbers that have not reached Western financial media. According to 36kr reporting published this Friday, the platform Creati surpassed 10 million global users within one year of launch and reached an annual recurring revenue of about $20 million. A separate industry source cited compute costs for a single short-drama production at roughly 30,000 yuan — about $4,000 — with leading tool platforms processing more than 100 such productions per month. Alibaba's HappyHorse video model launched gray-testing in late April at a list price of 0.9 yuan per second of 720p video. ByteDance's Seedance and Kuaishou's Kling are iterating on weekly minor releases and bimonthly major releases. The gross margins are thin and the competitive moat is shallow, but the revenue velocity is real.

Xiaohongshu — known internationally as RedNote — announced the formation of a new top-level AI division called Dots on April 30th, according to 36kr. The division was elevated from an internal lab called Hi Lab and now reports directly to the company's new president. Dots encompasses model research, infrastructure, engineering, and product — including the company's primary AI application, Dian Dian. The timing reflects urgency: Xiaohongshu has been cautious about AI integration to protect its community authenticity, but the agent narrative has accelerated internal pressure to move. With roughly 300 million monthly active users and a content corpus of real human lifestyle experience, Xiaohongshu holds training data that general-purpose models lack. The Dots formation is the company's first public signal that it intends to compete at the model layer, not just the application layer.

Apple's research lab published SFI-Bench this week — a video-based benchmark with more than 1,700 questions derived from egocentric indoor video scans. The benchmark tests what Apple calls spatial-functional intelligence: not just where objects are in a room, but what they are for and how they relate to human tasks. Existing spatial benchmarks like VSI-Bench test geometric perception. SFI-Bench tests higher-order reasoning about object affordance and function. Apple published this quietly, without a press release, through its machine learning research portal. For anyone tracking Apple's robotics and Vision Pro roadmap, this benchmark signals the specific capability gap Apple is trying to close — and the evaluation framework it will use to measure progress.

Looking ahead to next week. DeepSeek's reported $45 billion first funding round is the most consequential pending event in AI capital markets. China's Big Fund is in talks to lead. Tencent is among other investors still in discussions. If the round closes at that valuation, it would make DeepSeek one of the three most valuable private AI companies globally — alongside OpenAI and Anthropic. Watch for confirmation of the final investor lineup and any attached governance conditions. Beijing's involvement as lead investor would formalize DeepSeek's status as a state-strategic asset, with implications for export control negotiations and the nascent US-China AI talks.

The US-China AI dialogue reported by the Wall Street Journal this Friday has no confirmed date or format. But the signal is live. Watch for any State Department or Commerce Department confirmation next week. The context matters: DeepSeek's fundraising, China's 10,000-GPU cluster buildout, and the Huawei-optimized DeepSeek V4 deployment are all accelerating simultaneously. Any formal bilateral channel — even a preliminary working group — would be the first structured AI safety dialogue between the two countries since informal commitments made in 2023 that carried no enforcement mechanism.

Anthropic's Colossus-1 deal is expected to come online within a month. Next week, watch for any update on the timeline and on whether Anthropic raises its API rate limits as promised. The company said it would double Claude Code limits and significantly raise Opus API limits. Enterprise developers who have been hitting capacity ceilings on agentic coding workflows will be watching closely. Any slip in the Colossus-1 activation timeline would be a material signal about Anthropic's ability to meet demand — and would likely accelerate customer conversations with competing providers.

ServiceNow reports earnings next week. Consensus estimates are not yet finalized, but the company's shift toward consumption-based AI agent pricing — reported this Friday — will be a central topic on the call. Analysts will press management on the revenue recognition implications of per-action pricing versus per-seat contracts, and on whether early enterprise customers are showing measurable AI utilization rates. SAP and Workday are making the same pricing pivot. ServiceNow's call will be the first public data point on whether the new model is accelerating or complicating enterprise AI budget commitments.

South Korea's Kospi broke the 7,000 level this week on AI chip momentum, with Samsung crossing the $1 trillion market cap threshold. That rally was driven by AI memory demand — specifically HBM chips used in Nvidia GPU clusters. Next week, watch whether the Kospi holds above 7,000 as Monday opens in Seoul. Any softening in Nvidia's forward guidance — or any signal from the US-China trade talks affecting chip export rules — could reverse the Samsung-led rally quickly. The Asian market open on Monday will set the tone for semiconductor stocks in New York before the US session begins.

This podcast has a daily production cost. If you enjoy it, support it — the link is on the podcast page. Thank you.