Your Weekly AI Press Review — Week of April 25, 2026: Agentic Surge.
This Friday brought two seismic model drops — DeepSeek V4 and OpenAI's latest — plus a Federal Reserve study showing AI has already reshaped the programmer job market. In Part 1, we cover Friday's biggest moves in models, infrastructure, and enterprise deals. Part 1b surfaces signals the Bloomberg terminal missed, including a Chinese food-delivery giant training a trillion-parameter model entirely on domestic chips. Part 2 sets up your Monday with the regulatory and earnings events that matter.
DeepSeek released V4 on Friday — its first major architecture update since V3.2 last December. The flagship V4-Pro carries 1.6 trillion total parameters, with 49 billion active per token. The smaller V4-Flash runs 284 billion total, 13 billion active. Both support one-million-token context windows. Both are MIT-licensed and available on Hugging Face. V4-Pro is now the largest open-weight model available — bigger than Kimi K2.6 at 1.1 trillion and more than double DeepSeek's own V3.2. API pricing for V4-Pro comes in at about $1.74 per million input tokens, a fraction of the price of comparable closed models.
The architectural story behind V4 is as important as the parameter count. DeepSeek introduced two new attention mechanisms — Compressed Sparse Attention and Heavily Compressed Attention — that together slash compute requirements dramatically. At a one-million-token context, V4-Pro uses only 27 percent of the FLOPs and 10 percent of the KV cache memory of V3.2. That efficiency is what makes the pricing viable. The model was trained on 33 trillion tokens using FP4 quantization. DeepSeek also released both Base and Instruct variants — a rare move that sets the stage for a potential reasoning successor.
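One way to see why a 1.6-trillion-parameter model can be priced so low is the mixture-of-experts arithmetic: per-token compute scales with active parameters, not total. A rough sketch using the common two-times-active-parameters approximation for forward-pass FLOPs; the dense 1.6T comparison model is hypothetical, used only to show the gap:

```python
# Illustrative only: rough per-token forward-pass FLOPs, using the
# common approximation of 2 * active_params per token.
def forward_flops_per_token(active_params: float) -> float:
    return 2.0 * active_params

dense_1p6t = forward_flops_per_token(1.6e12)  # hypothetical dense 1.6T model
v4_pro_moe = forward_flops_per_token(49e9)    # V4-Pro's reported 49B active

ratio = dense_1p6t / v4_pro_moe
print(f"{ratio:.0f}x")  # ~33x fewer FLOPs per token for the MoE
```

The sparse-attention savings at long context stack on top of this, which is why the quoted pricing holds even at one million tokens.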
The geopolitical dimension of V4 is explicit. DeepSeek built native compatibility with Huawei's Ascend NPU family — a direct response to US export controls on NVIDIA hardware. Huawei Ascend supply is still roughly a quarter of H100 availability, but the compatibility milestone is real. Chinese sources at 36kr report that V4's delayed release was partly caused by a serious training failure in mid-2025 during the migration from NVIDIA to Ascend infrastructure. The model remains text-only — multimodal training was deferred due to compute and cash constraints.
OpenAI's latest model landed this week and is now available through the API with a one-million-token context window. It targets agentic workflows — planning, tool use, multi-step task completion, and self-correction. Independent benchmarking by Artificial Analysis puts it at the overall top spot, narrowly ahead of Anthropic's and Google's latest, though it shows a notable weakness on hallucination metrics. Effective API costs run about 20 percent higher than the prior generation — the doubled token prices on paper are partially offset by lower token usage per task. It's available to ChatGPT Plus, Pro, Business, and Enterprise subscribers.
NVIDIA confirmed this Friday that over 10,000 of its own employees are actively using OpenAI's latest model through the Codex platform. The company says debugging cycles that previously stretched days are closing in hours. Experimentation that took weeks is turning into overnight progress on complex, multi-file codebases. NVIDIA is running Codex on its own GB200 NVL72 rack-scale systems — hardware it says delivers 35 times lower cost per million tokens and 50 times higher token output per second per megawatt versus the prior generation. Jensen Huang sent a company-wide email urging adoption.
Meta signed a deal this Friday making it one of the world's largest customers of AWS Graviton cores — purpose-built CPU chips for agentic AI workloads. The agreement brings tens of millions of Graviton cores into Meta's compute portfolio. Meta's head of infrastructure cited the need to diversify compute sources for CPU-intensive agentic workloads. The announcement came one day after reports that Meta plans to cut about 8,000 employees — roughly 10 percent of its workforce — while leaving 6,000 open roles unfilled. Meta's projected AI infrastructure spend this year sits between $115 billion and $135 billion.
The US Justice Department intervened this Friday in a lawsuit by Elon Musk's xAI challenging Colorado's AI regulation law. The DOJ argued the Colorado statute — which requires developers to guard against unintended discriminatory effects — violates the 14th Amendment. The intervention escalates what was a single-company legal challenge into a direct federal-versus-state confrontation. Colorado's law is scheduled to take effect June 30. The Trump administration has been pushing for a single federal AI framework, and this filing is its most aggressive move yet on state-level AI regulation.
The UK government quietly revised its estimate of AI data center carbon emissions — upward by a factor of more than 100. New figures published this Friday put potential emissions from AI infrastructure at between 34 million and 123 million tonnes of CO2 over the next decade. The previous estimate, since deleted, had projected a maximum of 0.142 million tonnes in a single year. The revision appeared in an update to the UK's compute roadmap. The Department for Science, Innovation and Technology says the range depends heavily on model efficiency gains and grid decarbonization pace.
A Federal Reserve Board study published this Friday found that US programmer job growth has nearly halved since ChatGPT launched in late 2022. Before that date, programming-heavy occupations were growing at just under 5 percent annually — well above the overall labor market. Since then, growth has essentially flatlined in IT services and software development. The researchers built a counterfactual model to strip out broader tech sector pressures from rate hikes and the post-COVID correction. Even after that adjustment, programmer employment is running about 3 percentage points per year below the counterfactual trend. Compounded over three years, that gap represents roughly 500,000 jobs that likely would have existed without LLMs.
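The roughly 500,000-job figure is consistent with simple compounding. A back-of-envelope check, assuming a baseline of about 5.5 million programming-heavy jobs; the baseline is my assumption for illustration, not a figure from the study:

```python
# Back-of-envelope check with an assumed employment baseline.
baseline_jobs = 5_500_000  # assumed programming-heavy employment base
annual_gap = 0.03          # ~3 percentage points per year below trend
years = 3

missing = baseline_jobs * (1 - (1 - annual_gap) ** years)
print(f"{missing:,.0f}")   # roughly 480,000 jobs, near the study's figure
```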
Alibaba released Qwen3.6-27B this Friday — a dense 27-billion-parameter open-source model that outperforms its 397-billion-parameter predecessor on nearly every coding benchmark tested. It scored 77.2 on SWE-bench Verified versus 76.2 for the larger model, and 59.3 on Terminal-Bench 2.0 versus 52.5. The model handles both text and multimodal reasoning. As a dense architecture, it's simpler to deploy than mixture-of-experts alternatives. It's available through Alibaba Cloud, Qwen Studio, and as open weights on Hugging Face and ModelScope.
The IRS is running 126 active AI applications as of this Friday — up from just 10 in August 2022. The expansion covers audit selection, fraud detection, taxpayer services, and operational workflows. About 61 percent of those applications are still in development. The agency uses machine learning to score millions of returns simultaneously for noncompliance risk. Its criminal investigation unit uses Palantir-built tools to process suspicious activity reports. Revenue agents now have access to generative AI for drafting audit documents, with AI producing a first draft that the agent reviews and finalizes.
Trump administration AI deadlines set in a December executive order have passed without delivery, according to a Friday report from Axios. Three provisions due March 11 remain incomplete. The FTC was supposed to issue guidance on how consumer protection law applies to AI models. The Commerce Department was due to publish an evaluation of state AI laws. Neither has been publicly announced. The missed deadlines raise questions about how forcefully the administration can follow through on its push to preempt state-level AI regulation — even as it intervenes in Colorado's case through the DOJ.
Intel's Q1 2026 earnings call, held this week, showed revenue of $13.6 billion — beating expectations — with AI-driven business lines accounting for 60 percent of that figure, up 40 percent year-on-year. CEO Lip-Bu Tan argued that inference and agentic workloads are restoring the CPU to the center of compute. He cited customer demand for edge AI, robotics, and physical AI as evidence. Intel's 14A process node remains unfinished, with design commitments expected to begin in the second half of 2026. The company still lacks a credible GPU for AI training after canceling its most recent effort.
Nvidia crossed the $5 trillion market cap threshold this week, lifted by the broader AI infrastructure rally. The move came as hyperscalers reaffirmed capital expenditure commitments despite supply chain concerns tied to the US-Iran conflict and Strait of Hormuz disruptions. Moody's analyst David Pan told Axios that hyperscalers are committing roughly $650 billion to US AI infrastructure this year alone — and that the AI economy depends on Qatari helium, Israeli bromine, and LNG tankers with a single 21-mile-wide exit from the Persian Gulf. That physical dependency is now a live risk factor for chip supply chains.
Anthropic is facing compounding operational pressures ahead of a potential IPO that could value the company near $800 billion. Revenue has tripled to about $30 billion this year, driven largely by coding tools. But the past two months have brought model performance complaints, pricing confusion, security incidents, and capacity constraints. OpenAI is actively courting frustrated Anthropic enterprise customers, engaging consulting partners to deploy its Codex platform. The rivalry is now explicitly framed around which company reaches the public markets with more momentum.
The White House accused China of industrial-scale theft of US frontier AI systems in a memo this week, citing tens of thousands of proxy accounts used to extract proprietary model outputs. Michael Kratsios, director of the White House Office of Science and Technology Policy, said the administration would share intelligence with American AI companies about unauthorized distillation attempts. The accusation follows reports of proposed legislation that would sanction entities engaged in query-and-copy attacks on US models. Anthropic previously said it found 24,000 fraudulent accounts used to generate 16 million exchanges with its Claude model.
Band, a Tel Aviv and San Francisco startup, exited stealth this Friday with a $17 million seed round to build dedicated interaction infrastructure for autonomous AI agents. CEO Arick Goomanovsky and CTO Vlad Luzin are targeting the coordination layer between agents that operate across different cloud environments, frameworks, and business owners. The company argues that adding more business logic to fragmented multi-agent systems fails — and that a distinct infrastructure layer is required, analogous to the API gateway for microservices. Available disclosures do not name a lead investor for the round.
On deployments. The Home Depot this Friday reported early results from its AI voice agent pilot. The system identifies customer intent in roughly 10 seconds and delivers solutions four times faster than traditional menu-based phone systems. It can initiate service requests, send product links, help customers complete purchases, and assemble a shopping cart from a verbal project description. The deployment converts what was a cost center into a point of conversion — keeping customers within a controlled environment where the retailer can guide purchasing decisions.
Wells Fargo's AI-powered assistant Fargo has surpassed one billion customer interactions — three years after launch. Management highlighted the figure on its earnings call this week as evidence of sustained self-service adoption at scale. The bank frames the milestone as a shift in customer comfort with resolving issues without a human agent. The same pattern is visible across bank earnings commentary this quarter, with management teams consistently citing AI-assisted servicing as a driver of digital engagement and reduced operational cost.
The IRS deployed AI for real-time fraud detection during the return filing process itself — flagging emerging compliance threats as returns arrive. The agency's 126 active AI applications now span audit selection, fraud scoring, and generative AI for drafting information document requests. Revenue agents review and finalize AI-produced first drafts. The criminal investigation unit uses Palantir tools to process suspicious activity reports, compressing work that previously took many hours of agent time per case. The IRS has focused enforcement capacity on large corporations, complex partnerships, high-wealth individuals, and digital asset users.
Siemens reported that its Eigen Engineering Agent — deployed across more than 100 companies in 19 countries — executes automation engineering tasks two to five times faster than manual workflows. The system interprets project requirements, generates PLC code, configures industrial systems, and iterates until predefined performance targets are met. Prism Systems used it to generate automation code for legacy environments without manual translation. The deployment addresses a global shortfall Siemens estimates at up to 7 million manufacturing workers by 2030, with roughly 1 in 5 engineering roles currently unfilled.
AppZen launched its AP Inbox Service Center this week — eight prebuilt AI agents that automate vendor email handling for finance teams. The agents cover payment status responses, bank change verification, duplicate invoice detection, vendor statement reconciliation, W-9 compliance routing, and remittance assistance. AppZen's own customer data shows AP reviewers spend as much as one week per month on that work. The bank change agent escalates every request to vendor management with a risk classification based on domain mismatches, urgency language, and unknown senders. Deployment requires no IT involvement.
Ulta Beauty launched an AI shopping assistant called Ulta AI on Ulta.com this week, built with Google's Gemini Enterprise for Customer Experience. The assistant draws on insights from 46 million loyalty members to provide personalized guidance. Separately, Ulta is rolling out agentic commerce across Google surfaces — including AI Mode in Search and the Gemini app — over the next month, using the Universal Commerce Protocol standard. Shoppers can receive product recommendations, compare options, and complete checkout within Google's conversational interfaces. Ulta said in March that AI contributed to better-than-planned Q4 financial performance.
Portal26 launched Agentic Token Controls this week — a module that lets administrators set token budgets for individual agents, specific workflows, or the entire organization. Agents nearing a cap get throttled; those that exceed one can be paused or killed. The company cited Uber as an enterprise that discovered adoption speed and cost predictability are on a collision course. The launch targets a real production problem: multistep autonomous agents can enter recursive loops, over-query systems, or expand tasks beyond scope — generating exponential token usage and surprise invoices with no traditional budget controls to catch them.
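The core mechanism is simple to sketch: track cumulative usage per agent, throttle near the cap, pause past it. A minimal illustration, assuming a throttle threshold at 80 percent of the cap; this is my sketch of the idea, not Portal26's actual API:

```python
# Minimal sketch of a per-agent token budget gate (illustrative only).
class TokenBudget:
    def __init__(self, cap: int, throttle_at: float = 0.8):
        self.cap = cap                # hard token limit for this agent
        self.throttle_at = throttle_at  # fraction of cap where throttling starts
        self.used = 0

    def record(self, tokens: int) -> str:
        """Record usage and return the agent's resulting state."""
        self.used += tokens
        if self.used >= self.cap:
            return "paused"      # cap exceeded: halt the agent
        if self.used >= self.cap * self.throttle_at:
            return "throttled"   # nearing the cap: slow its requests
        return "ok"

budget = TokenBudget(cap=1_000_000)
print(budget.record(500_000))  # ok
print(budget.record(350_000))  # throttled (850k past the 800k threshold)
print(budget.record(200_000))  # paused (1.05M past the 1M cap)
```

A recursive agent loop hits the throttle long before it generates a surprise invoice, which is the failure mode the launch is aimed at.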
Off the radar. Meituan — China's dominant food delivery and local commerce platform — quietly opened testing this Friday for LongCat-2.0-Preview, a new foundation model with over one trillion parameters. The training run was completed entirely on domestic Chinese compute clusters — no NVIDIA hardware. The model supports a one-million-token context window and has been optimized for agentic applications including code generation, complex task planning, and enterprise automation. Chinese tech publication 36kr reported the detail. It's the clearest signal yet that Chinese internet giants are building frontier-scale models on fully domestic infrastructure stacks.
Huawei's HiFloat4 training format — a 4-bit precision format for AI training and inference on Ascend NPUs — outperformed the Western-developed MXFP4 standard in a systematic bakeoff. Huawei researchers tested across three model families: OpenPangu-1B, Llama3-8B, and Qwen3-MoE-30B. HiFloat4 achieved roughly 1 percent relative loss error versus a BF16 baseline. MXFP4 came in at about 1.5 percent. The gap widens as model size increases. This is not just a chip story — it's a precision format story. China is building its own low-level numerical standards explicitly coupled to its own hardware, reducing dependence on Open Compute Project specifications developed in the West.
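The headline metric is straightforward: the gap between a low-precision run's training loss and the BF16 baseline's loss, as a fraction of the baseline. A sketch with hypothetical loss values chosen to match the reported roughly 1 percent and 1.5 percent figures (the loss numbers themselves are invented for illustration):

```python
# Relative loss error of a low-precision training run vs a BF16 baseline.
def relative_loss_error(low_precision_loss: float, bf16_loss: float) -> float:
    return abs(low_precision_loss - bf16_loss) / bf16_loss

bf16 = 2.000  # hypothetical baseline training loss
print(f"{relative_loss_error(2.020, bf16):.1%}")  # 1.0% (HiFloat4-like gap)
print(f"{relative_loss_error(2.030, bf16):.1%}")  # 1.5% (MXFP4-like gap)
```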
The UAE announced this Friday that it intends to shift 50 percent of all government sectors, services, and processes to autonomous agentic AI within two years. Sheikh Mohammed bin Rashid Al Maktoum announced the plan on X. Every federal employee will be trained to work with AI systems. The UAE frames this as making it the first government in the world to rely on autonomous AI at this scale. There are no democratic oversight mechanisms or independent press to audit the rollout. For enterprise vendors, this is a procurement signal — the Gulf states are moving from pilot to mandate, and the contract volumes will be substantial.
China's State Grid Corporation earmarked 6.8 billion yuan — about $1 billion — for AI-powered robots to operate its power grid in 2026 alone. The plan covers roughly 8,500 robots for inspecting remote substations and maintaining ultra-high-voltage power lines. When similar plans from China Southern Power Grid are included, total sector investment in embodied intelligence is expected to exceed 10 billion yuan this year. About 5.8 billion yuan of State Grid's budget goes to hardware procurement. This is physical AI deployment at a scale and speed that has no Western equivalent in critical infrastructure — and it's happening largely outside English-language coverage.
A Chinese industrial AI startup called Zhiyong Kaiwu — with a core team from Microsoft China — closed its third funding round in 12 months, raising close to 100 million yuan in an angel-plus round led by Ruifeng Capital. The company builds multi-agent systems for factory floors, with native support for OPC UA industrial protocols and millisecond-level response times. At anchor customer Luxshare Precision, a single AI scheduling agent is performing the work of 6 human employees. SOP automation rates reached 80 percent. New-hire onboarding time dropped from 1.5 days to 2 hours. This is the kind of industrial AI deployment that doesn't make Western tech press — but it's exactly where agentic AI is generating measurable ROI at scale.
Anthropic ran an internal experiment called Project Deal in December 2025 — 69 employees traded real goods via AI agents in a Slack-based marketplace for one week. Each participant had a $100 budget. The key finding: Claude's stronger model consistently secured better prices and closed more deals than the weaker model. The people assigned the weaker agent rated the fairness of their transactions just as highly as those with the stronger model — they had no idea they were losing out. Anthropic published the results this week. The implication for financial services is direct: as AI agents begin handling real negotiations and transactions, model tier becomes a source of invisible economic inequality.
Looking ahead to next week. Anthropic's IPO trajectory is the most consequential financial story to watch. Axios reported this week that the company could be valued near $800 billion at IPO — but it's entering that process with compounding operational problems: model performance complaints, pricing confusion, capacity constraints, and OpenAI actively poaching its enterprise customers. Any new valuation leak, investor day signal, or S-1 filing movement next week will set the tone for how the market prices the AI lab category heading into summer.
The Colorado AI regulation case moves to federal court with the US Justice Department now formally intervening on behalf of xAI. The law is scheduled to take effect June 30. A federal judge will need to rule on whether to block enforcement before that date. This is the first time the Trump administration has directly entered a state AI regulation lawsuit — and the outcome will signal how aggressively the federal government will preempt state-level AI rules. Enterprise legal and compliance teams should track the docket closely. The ruling could affect similar laws in development across a dozen other states.
Watch for any DeepSeek fundraising announcement. Chinese sources at 36kr and the South China Morning Post both reported this week that DeepSeek opened its first external funding round in mid-April — seeking to sell no more than 3 percent of equity, with state-backed funds expected to participate. The round is designed to set a valuation benchmark for employee stock options and stem talent departures to ByteDance and Tencent. One estimate puts the valuation above $100 billion. If a close is announced next week, it will be the most significant private AI valuation signal from China since DeepSeek's R1 launch.
Microsoft, Alphabet, Meta, and Amazon all report earnings next week. Analysts are watching for AI revenue line items, capital expenditure guidance, and any commentary on agent adoption rates. Meta's earnings will be particularly scrutinized — the company announced both a major AWS Graviton deal and a 10 percent workforce reduction in the same week. Investors will want to know whether the AI infrastructure spend is generating measurable returns, or whether operating margin compression is accelerating. Alphabet's commentary on TPU 8t and 8i deployment timelines will also be closely watched after the Google Cloud Next announcements.
The US-Iran conflict and Strait of Hormuz situation remains an active risk for AI infrastructure supply chains. Moody's flagged this week that the AI buildout depends on Qatari helium, Israeli bromine, and LNG tankers with a single 21-mile-wide exit from the Persian Gulf. Any escalation over the weekend that further restricts tanker traffic will hit chip manufacturing inputs — and could move semiconductor stocks at Monday's open. Brent crude was trading near $95 earlier this week. Watch energy prices Sunday evening as a leading indicator for how chip and data center stocks open Monday.
This podcast has a daily production cost. If you enjoy it, support it — the link is on the podcast page. Thank you.