AI Press Review
May 04, 2026 · Episode 9 · 22:37

Pentagon Signs Seven AI Firms, Big Tech Hits $725B Spend, GitHub Copilot Goes Per-Token

The Pentagon signed AI agreements with seven companies — including OpenAI, Google, Nvidia, and SpaceX — for any-lawful-use military deployment, while Anthropic was notably absent after refusing that standard. Sun Finance's AWS-powered identity pipeline cut per-document costs by 91 percent and slashed processing time from 20 hours to under 5 seconds, the week's sharpest deployment result. Looking ahead, watch Qualcomm's June investor day for its custom hyperscaler silicon reveal, and monitor whether Anthropic reaches a formal Pentagon agreement after the White House signaled it needs the company back in the fold.


Your Weekly AI Press Review — Week of May 03, 2026: Agentic Surge.

This Friday, the Pentagon made its biggest AI procurement move yet — signing seven companies for military deployment while pointedly leaving Anthropic out. In this episode: Friday's top stories from the NYSE close, the most concrete deployments of the week, and signals the mainstream press missed — including a Chinese regulatory move that could reshape how global investors access the country's top AI startups. Let's get into it.

The Pentagon announced this Friday it had signed AI agreements with seven companies: SpaceX, OpenAI, Google, Nvidia, Reflection AI, Microsoft, and Amazon Web Services. The contracts authorize 'any lawful use' of their technology by the US military. The Defense Department has requested $54 billion for autonomous weapons development alone. Anthropic was conspicuously absent — the company had refused to include the lawful-use standard after a months-long feud with the Pentagon over potential AI misuse. The White House signaled Friday it is now inching toward welcoming Anthropic back, acknowledging its models are too capable to ignore.

Nebius Group, the Dutch AI data center operator, announced Friday it will acquire model optimization startup Eigen AI for $643 million in cash and stock. Nebius plans to integrate Eigen AI's technology into its Token Factory managed inference service. Eigen AI's platform replaces default model kernels with custom CUDA and Triton modules, improving speed and hardware efficiency. It also compresses model weights and enhances KV cache performance. The deal is expected to close within weeks. Nebius's Token Factory currently supports more than a dozen open-source models.

GitHub Copilot announced this Friday that starting June 1st, it will shift from a flat-rate subscription to per-token pricing. A base-tier Copilot Pro subscriber at $10 per month receives 1,000 AI Credits — each worth one US cent. Token consumption varies by model, input-output mix, and cache size. Simple queries will likely stay within budget. Multi-agent tasks on large codebases will drain credits fast. The move aligns Copilot's pricing with standard API billing and follows Anthropic's earlier move to restrict Claude Code from its most affordable plans.
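To see why multi-agent tasks drain credits so quickly, here is a back-of-the-envelope model of how token usage maps to AI Credits. Only the plan facts from the announcement are used (1,000 credits at one cent each for $10 per month); the per-million-token rates and token counts below are hypothetical illustrations, not GitHub's published prices.

```python
# Back-of-the-envelope model of per-credit AI billing.
# Plan facts from the announcement; rates are HYPOTHETICAL illustrations.

CREDITS_PER_MONTH = 1_000   # Copilot Pro base tier at $10/month
CREDIT_VALUE_USD = 0.01     # each AI Credit is worth one US cent

def credits_used(input_tokens, output_tokens,
                 usd_per_m_input, usd_per_m_output):
    """Convert one request's token usage into AI Credits."""
    cost = (input_tokens / 1e6) * usd_per_m_input \
         + (output_tokens / 1e6) * usd_per_m_output
    return cost / CREDIT_VALUE_USD

# A small chat query vs. a multi-agent run over a large codebase,
# priced at illustrative rates of $1.25/M input and $10/M output tokens.
small = credits_used(2_000, 500, 1.25, 10.0)        # 0.75 credits
agent = credits_used(900_000, 60_000, 1.25, 10.0)   # 172.5 credits

print(f"small query: {small:.2f} credits")
print(f"agent task:  {agent:.2f} credits")
print(f"agent tasks per month on base tier: {CREDITS_PER_MONTH / agent:.1f}")
```

Under these assumed rates, a base-tier subscriber exhausts the month's budget after roughly half a dozen large agentic runs, while simple completions cost well under a credit each — which is exactly the split the pricing change creates.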

Big tech's combined AI capital expenditure for 2026 has reached approximately $725 billion, according to the Financial Times — up from a $610 billion estimate in February and a 77 percent jump over last year's $410 billion. Google, Amazon, Microsoft, and Meta burned through $130 billion in Q1 alone. Google's cloud revenue grew 63 percent in its latest quarter. Both Google and Microsoft said they still lack sufficient compute capacity to meet demand. Microsoft CEO Satya Nadella flagged the industry's monetization path: per-seat pricing plus usage fees, meaning enterprise software bills are heading higher.

Qualcomm's CEO Cristiano Amon disclosed on the company's Q2 earnings call this Friday that Qualcomm has quietly entered the custom hyperscaler silicon market. It will supply a custom product to 'a leading hyperscaler' with shipments expected in the December quarter. Amon described a 'dedicated CPU for agentic experiences in the data center' and said Qualcomm is already working on high-performance AI inference accelerators. He also teased 'agentic smartphones,' citing ZTE's Doubao-integrated handset and Xiaomi's OS-level AI assistant as early examples. Qualcomm will hold an investor day in June to reveal more.

xAI released Grok 4.3 this Friday, a developer-focused model priced at $1.25 per million input tokens and $2.50 per million output tokens — roughly 40 percent cheaper on inputs and 60 percent cheaper on outputs versus its predecessor. The model runs at 100 tokens per second with a one-million-token context window. On the GDPval-AA real-world knowledge work benchmark, Grok 4.3's Elo score jumped 321 points to 1,500. A full benchmark run costs $395, compared to $3,959 for OpenAI's flagship and $4,811 for Anthropic's top model. xAI also launched Grok Imagine Agent Mode for creative production workflows.

The Musk versus Altman trial continued this Friday, with week one delivering significant disclosures. Musk testified for more than seven hours over three days in federal court in Oakland. He called himself 'a fool' for providing $38 million in early funding to OpenAI, which went on to become an $800 billion company. OpenAI's lawyer confronted Musk with emails showing he had backed a for-profit structure and a Tesla takeover of OpenAI. A separate revelation: xAI appears to have distilled outputs from OpenAI's models for its own AI training — an admission that surfaced during cross-examination.

Chinese AI startups are reportedly unwinding their offshore corporate structures to register directly in China. Moonshot AI, the company behind Kimi, is in talks with lawyers about restructuring as it closes a funding round at an $18 billion valuation. StepFun has already started dissolving its foreign holding structure. The shift follows China's securities regulator signaling that companies hoping to go public should be registered domestically. Beijing blocked Meta's attempted acquisition of AI startup Manus, which triggered the regulatory warning. The restructuring process takes six to twelve months and complicates foreign capital raises.

US stock indexes closed at all-time highs this Friday, capping their best monthly performance since 2020. The S&P 500 and Nasdaq both posted record closes as megacap tech earnings — led by Meta, Microsoft, Google, and Amazon — beat expectations. AI infrastructure spending drove the upside. In contrast, the digital asset market saw about $300 million in crypto long positions liquidated in the same session. The split signals institutional confidence in AI earnings power alongside a sharp pullback in speculative leverage.

Meta's Q1 earnings call, covered in detail this Friday, revealed that LLM-based recommendation systems are already delivering measurable results. Doubling the length of user interaction sequences used for training on Instagram drove a 10 percent lift in Reels time spent. Facebook's global video time increased more than 8 percent — the largest gain in four years. Same-day posts now account for more than 30 percent of recommended Reels, more than double the level a year ago. Over half a billion users on Facebook and Instagram now watch AI-generated videos each week.

Planet Labs achieved a milestone this Friday: running AI image processing aboard its Pelican-4 satellite in orbit. The satellite identified more than a dozen aircraft on the tarmac at Alice Springs Airport in Australia, each highlighted in real time by an on-board AI model. Planet Labs' engineers spent 18 months achieving reliable autonomous object classification from space. The company's constellation generates 30 terabytes of data per day. On-board processing eliminates the 6-to-12-hour delay between image capture and ground analysis — critical for wildfire detection and military surveillance.

Apple published a research paper this Friday on inference-time feedback for tool-calling agents, accepted at ACL 2026. The paper introduces a 'Reinforced Agent' architecture where a secondary reviewer agent evaluates tool calls before execution. On multi-turn stateful tasks, the approach improved accuracy by 7.1 percent. On irrelevance detection, gains reached 5.5 percent. The key finding: reviewer model choice is critical. The reasoning model used in the study achieved a 3-to-1 benefit-to-risk ratio versus 2.1-to-1 for a standard model. Automated prompt optimization added a further 1.5 to 2.8 percent.
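The core idea — a reviewer gate between the executor agent and the tools — can be sketched in a few lines. This is a minimal illustration of the execution/review separation, not Apple's implementation: the rule-based reviewer and all names here are assumptions, and a real system would use a second model (per the paper, ideally a reasoning model) as the reviewer.

```python
# Sketch of a reviewer-gated tool-calling loop: the reviewer judges each
# proposed call BEFORE execution (proactive prevention, not post-hoc recovery).
# All names and the rule-based reviewer are illustrative assumptions.

from dataclasses import dataclass
from typing import Callable

@dataclass
class ToolCall:
    name: str
    args: dict

def review(call: ToolCall, available_tools: set) -> tuple:
    """Stand-in reviewer. A real system would ask a second model whether
    the proposed call is relevant, well-formed, and safe to run."""
    if call.name not in available_tools:
        return False, f"unknown tool: {call.name}"
    if not call.args:
        return False, "empty arguments"
    return True, "ok"

def execute_with_review(call: ToolCall, tools: dict):
    approved, reason = review(call, set(tools))
    if not approved:
        # The call never runs; the executor agent receives the reviewer's
        # feedback and can propose a revised call on the next turn.
        return {"status": "rejected", "reason": reason}
    return {"status": "ok", "result": tools[call.name](**call.args)}

tools = {"lookup_order": lambda order_id: {"order_id": order_id, "state": "shipped"}}
good = execute_with_review(ToolCall("lookup_order", {"order_id": "A1"}), tools)
bad = execute_with_review(ToolCall("cancel_order", {"order_id": "A1"}), tools)
print(good["status"], bad["status"])
```

The clean separation is the point: the executor and the reviewer can be swapped or improved independently, which is why the paper found reviewer model choice to be the critical variable.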

Australia's financial regulator APRA issued a formal warning this Friday to banks and superannuation trustees about AI governance gaps. APRA conducted a targeted review of large regulated entities in late 2025 and found AI in use across all of them — but maturity in risk management varied sharply. The regulator flagged gaps in model behavior monitoring, change management, and decommissioning. It specifically called out the failure to adjust identity and access management for non-human AI agents. APRA also warned that AI adoption is expanding attack pathways, including prompt injection and insecure integrations.

Microsoft launched a Legal Agent inside Word this Friday, specifically designed for legal teams. The agent handles contract review clause by clause against a playbook, tracks negotiation history, and flags risks and obligations. It follows structured workflows shaped by real legal practice rather than general AI model interpretation. The product comes from engineers Microsoft hired from Robin AI, a failed AI contract review startup. Legal Agent is rolling out to Frontier program members in the US. The launch coincides with a broader industry signal: AI is eliminating entry-level legal work faster than law schools can adapt.

The ARC Prize Foundation published analysis this Friday of 160 game runs by OpenAI's and Anthropic's latest flagship models on the ARC-AGI-3 benchmark. Both scored below 1 percent on tasks humans solve without any prior knowledge. The leading model hit 0.43 percent at a cost of around $10,000 per run. The analysis identified three systematic error patterns: models correctly detect local effects but cannot build working world models; they form correct hypotheses but reject them; and they get stuck on wrong hypotheses despite contradicting evidence.

Major US stock indexes closed at record highs Friday, with the Nasdaq Composite posting its best monthly gain since 2020. The rally was driven by AI-linked megacap earnings beats from Meta, Alphabet, Amazon, and Microsoft. Nvidia, which supplies the GPU infrastructure underpinning all four companies' AI buildouts, benefited from the sentiment. The combined $725 billion capex commitment from the four hyperscalers signals sustained demand for Nvidia's hardware through at least 2027. Memory chip makers Samsung and SK Hynix separately warned of a record supply squeeze, with Samsung customers already pre-booking capacity for 2027.

The Anthropic-Pentagon standoff reached a new phase this Friday. Axios reported the White House is actively working to bring Anthropic back into the government fold after months of legal battles. The Trump administration's AI acceleration strategy requires the most capable frontier models — and Anthropic's latest is among them. Anthropic had sued the Pentagon after being labeled a supply chain risk. The company refused the 'any lawful use' standard that the seven other firms accepted Friday. The White House now faces the tension of needing a company it has been fighting in court.

On deployments. Sun Finance, a Latvian fintech processing a new loan request every 0.63 seconds, rebuilt its identity verification pipeline using Amazon Bedrock, Amazon Textract, and Amazon Rekognition. The results are stark: extraction accuracy improved from 79.7 percent to 90.8 percent, per-document costs fell 91 percent, and processing time dropped from up to 20 hours to under 5 seconds. The company handles 80,000 monthly microloan applications across nine countries. Before the rebuild, roughly 60 percent of applications required manual operator review. The solution went live in production 35 business days after technical handover.

Meta's Q1 deployment results confirm that LLM-based content ranking is now a revenue driver, not a pilot. Doubling training sequence length on Instagram produced a 10 percent lift in Reels time spent. Facebook video time rose more than 8 percent — the largest gain in four years. Same-day posts now represent more than 30 percent of recommended Reels, more than double the prior year. Meta CFO Susan Li said the company is scaling model size and complexity and incorporating LLMs to deepen content understanding. The company is also validating LLM-based recommender architectures before broader rollout in future years.

Planet Labs deployed an AI model aboard its Pelican-4 satellite in orbit, achieving real-time object classification from space for the first time after 18 months of engineering work. The satellite identified aircraft at Alice Springs Airport in Australia without ground-side processing. Planet Labs' constellation generates 30 terabytes of data daily across several hundred satellites. Eliminating the 6-to-12-hour ground processing delay enables real-time wildfire detection, military surveillance, and autonomous satellite tasking. The company is building a fleet of 32 Pelican satellites imaging Earth's surface at 30-centimeter resolution.

Microsoft's Legal Agent in Word is now rolling out to Frontier program members in the US. The agent reviews contracts clause by clause against a defined playbook, analyzes tracked-change documents, and flags risks and obligations. It follows structured legal workflows rather than general model interpretation — a design choice that directly addresses the trust gap that has slowed AI adoption in law firms. The product draws on engineering talent Microsoft hired from Robin AI after the startup failed. The deployment targets a sector where AI is already eliminating entry-level associate work at measurable scale.

Popsa, a photo book platform available in more than 50 countries and 12 languages, deployed Amazon Nova models via Amazon Bedrock to generate personalized photo book titles. The system combines metadata, computer vision, and retrieval-augmented generation. In 2025, it generated over 5.5 million personalized titles. The deployment improved title quality, reduced cost, and cut response times versus the prior system. Popsa used Amazon Nova Lite and Pro alongside Anthropic's Claude Haiku through a unified Bedrock API. The result was measurable uplifts in customer engagement and purchase rates.

Choco, a food distribution platform, processes 8.8 million orders annually using OpenAI APIs, handling over 200 billion AI tokens in production. The deployment cut manual order entry by 50 percent and doubled sales team productivity without adding headcount. Choco operates globally across the food and beverage sector, connecting restaurants with suppliers. The system handles always-on operations across time zones, processing orders from multiple input formats including voice, email, and messaging apps. The case demonstrates that agentic order management at scale is now commercially viable in logistics.

Apple's Reinforced Agent research, published Friday and accepted at ACL 2026, demonstrated a concrete deployment architecture for enterprise tool-calling agents. A secondary reviewer agent evaluates tool calls before execution, shifting from post-hoc error recovery to proactive error prevention. On multi-turn stateful tasks, accuracy improved 7.1 percent. On irrelevance detection, gains reached 5.5 percent. The architecture introduces a clean separation between execution and review agents — each improvable independently. Automated prompt optimization via GEPA added a further 1.5 to 2.8 percent on top of model selection gains.

Off the radar. China's National Development and Reform Commission blocked foreign investment in the Manus AI agent project earlier this week — a move that received limited English-language coverage but carries significant implications for cross-border AI M&A. Beijing ruled that Manus's core algorithms fall under restricted export technologies, requiring compliance with technology export licensing and data security assessment procedures. The NDRC found that despite Manus's parent company relocating its registered headquarters to Singapore, its China-based entities remained active and legally unseparated from the technology. This is the first time China has explicitly invoked its Foreign Investment Security Review framework to block an AI acquisition — setting a precedent that will affect every foreign investor eyeing Chinese AI assets.

Chinese AI startups are restructuring their corporate architecture in ways that will materially affect foreign investors' access. Moonshot AI, StepFun, and DeepRoute.ai are among the companies reportedly dissolving offshore Cayman Islands holding structures to register directly in China. The process takes six to twelve months and complicates future foreign capital raises. China's securities regulator has signaled that offshore-registered companies face tougher IPO approval. For institutional investors with positions in Chinese AI via offshore vehicles, this restructuring wave creates both valuation uncertainty and potential liquidity constraints. Moonshot AI is closing a round at an $18 billion valuation mid-restructuring.

DAIMON Robotics, a two-and-a-half-year-old Hong Kong company, released Daimon-Infinity this week — described as the largest omni-modal robotic dataset for physical AI, featuring high-resolution tactile sensing. The dataset spans tasks from laundry folding to factory assembly lines and includes 10,000 hours of open-sourced data. DAIMON's fingertip-sized tactile sensor packs over 110,000 effective sensing units. The company's co-founder, Professor Michael Yu Wang, has pioneered Vision-Tactile-Language-Action architecture — elevating touch to a modality on par with vision. Partners include Google DeepMind, Northwestern University, and the National University of Singapore. This is the kind of physical AI infrastructure work that rarely surfaces in Western tech media but is foundational to the next generation of industrial robotics.

North Korean threat actor Famous Chollima — also known as Shifty Corsair — ran a supply chain attack this week that exploited AI-generated code. Researchers at ReversingLabs found malicious code in an npm package that was introduced via a commit co-authored by an LLM. The package targeted a Solana-based autonomous trading agent and gave attackers access to users' crypto wallets. The attack used a phased approach: first-layer packages appeared clean, while second-layer packages embedded the malicious functionality. The campaign, codenamed PromptMink, demonstrates that AI coding tools are now an active attack surface — not just a productivity tool. Enterprise security teams have not yet updated their software supply chain controls to account for LLM-assisted commits.

Australia's APRA issued a formal supervisory warning to financial institutions this week that has received almost no coverage outside specialist regulatory circles. The regulator found that banks and superannuation trustees are deploying AI in loan processing, claims triage, fraud detection, and customer interaction — but identity and access management practices have not been updated to account for non-human AI agents. APRA specifically flagged prompt injection and insecure integrations as new attack pathways. It also found that some boards are relying on vendor presentations rather than independent scrutiny. For financial institutions in any jurisdiction, this is a preview of the regulatory posture that other prudential supervisors are likely to adopt within 12 to 18 months.

London is quietly consolidating its position as Europe's leading AI hub, according to reporting in L'Usine Digitale this Friday — a development largely absent from English-language tech media. US hyperscalers including Google, Microsoft, and Amazon have made significant infrastructure and talent commitments in London over the past 12 months. The city's combination of English-language talent, proximity to European regulatory bodies, and post-Brexit financial flexibility is attracting AI lab expansions that might otherwise have gone to Paris or Berlin. For European enterprise technology buyers and investors, London's emergence as the dominant AI cluster on the continent has direct implications for where talent, compute, and regulatory influence will concentrate over the next three to five years.

Looking ahead to next week. The Musk versus Altman trial resumes in federal court in Oakland. Week one produced major disclosures — including the admission that xAI distilled outputs from OpenAI's models, and Musk's acknowledgment that he backed a for-profit OpenAI structure in his own emails. Week two is expected to feature testimony from Sam Altman and OpenAI president Greg Brockman. The outcome could affect OpenAI's $800 billion valuation and its ongoing conversion from nonprofit to for-profit — a structural change that underpins its ability to raise further capital and execute the AWS partnership announced this week.

Watch for any formal announcement on Anthropic's Pentagon relationship. The White House signaled this Friday it is actively working to bring Anthropic back into the government AI fold. Anthropic was the only major frontier lab excluded from Friday's seven-company military agreement. Its latest flagship model is among the most capable available — and the Defense Department's $54 billion autonomous weapons budget creates strong institutional pressure to resolve the standoff. A formal agreement or executive order could move Anthropic's valuation and reshape the competitive dynamics of the government AI market.

GitHub Copilot's per-token pricing takes effect June 1st. Next week, enterprise technology buyers will begin modeling the cost impact. A base-tier subscriber at $10 per month receives 1,000 AI Credits. Multi-agent tasks on large codebases will exhaust that budget quickly. Expect enterprise procurement teams to accelerate evaluations of local model alternatives — Alibaba's Qwen family and other open-weight models are already being positioned as cost-effective substitutes. The pricing shift also sets a precedent: Microsoft's move signals that the flat-rate AI subscription era for developer tools is ending.

Qualcomm's investor day is scheduled for June, but the company's Q2 earnings call this Friday pre-announced two major product categories: a custom CPU for agentic data center workloads and high-performance AI inference accelerators for an unnamed hyperscaler. Shipments are expected in the December quarter. Watch for any leak or early disclosure of the hyperscaler identity next week — the market will price in the contract size immediately. Qualcomm's stock reaction to the earnings call sets the baseline. The company also flagged a coming memory supply constraint for agentic smartphones, which could affect SK Hynix and Samsung valuations.

Tech Week Shanghai debuts May 6 and 7 at Kerry Hotel Shanghai — the founding edition of what organizers plan to scale into a full flagship event in 2027. Confirmed exhibitors include China Telecom, China Mobile, China Unicom, Siemens, Honeywell, and the Shanghai Foundation Model Innovation Center. The event is the first structured attempt to connect global enterprise technology providers with China's data ecosystem since Beijing tightened its grip on AI exports and foreign investment. Given the Manus acquisition block and the offshore restructuring wave among Chinese AI startups, the conversations at this event will be a real-time read on how foreign technology firms are navigating China's tightening AI regulatory environment.

This podcast has a daily production cost. If you enjoy it, support it — the link is on the podcast page. Thank you.