Sector Brief · AI Semiconductors · Published Apr 29, 2026 · Timed to NVDA Q1 FY27 print (May 27 AMC)

Custom Silicon vs. NVDA: The ASIC Challenger Map

NVIDIA holds ~80% of the AI accelerator market — but that share peaked at ~87% in 2024 and is structurally declining. Every major hyperscaler now has custom silicon in production. This is the bear-case companion brief before NVDA's most-watched print.

NVIDIA Accel Share: ~80% (FY26)
Custom ASIC CAGR: 27–45%
AVGO AI Rev Q1 FY26: $8.4B (+106%)
Vektor's Call: ~20% ASIC share by 2027
Executive Summary
NVIDIA holds ~80% of the AI accelerator market in 2026 ($194B data center revenue FY26), but that share peaked at ~87% in 2024 and is structurally declining.
Every major hyperscaler now has custom silicon in production — AWS Trainium3 is nearly fully subscribed, Google Ironwood (TPU v7) scales to 42.5 exaflops, Meta MTIA v2 is deploying across Facebook/Instagram/WhatsApp, Microsoft Maia 200 is live in Azure.
Custom ASIC shipments are projected to surpass GPU shipments in volume by 2027 (Bloomberg Intelligence), though NVIDIA wins on revenue for years longer.
Broadcom is the actual infrastructure winner — $8.4B AI semiconductor revenue Q1 FY26 (+106% YoY), guided $10.7B Q2 (+140% YoY), $73B AI backlog, $100B FY2027 target.
NVIDIA's moat is real and durable — CUDA's 5M+ developer ecosystem, NVLink/Spectrum-X networking stack, and 100% win rate on frontier model training are not being displaced. The custom silicon threat is primarily an inference and cost-optimization story, not a training story.
Vektor's call: Custom silicon captures ~20% of hyperscaler AI accelerator spend by end of 2027.
Landscape Map

Hyperscaler In-House Silicon (Production, Q2 2026)

| Challenger | Latest Chip | Process | Key Spec | Status | Primary Use Case |
|---|---|---|---|---|---|
| AWS Trainium3 | Trainium3 | TSMC 3nm | 2.52 PFLOPS FP8, 144GB HBM3e, 4.9TB/s | Nearly fully subscribed | Training + Inference (Bedrock, Claude) |
| Google TPU v7 (Ironwood) | Ironwood | TSMC custom | 4,614 TFLOPs FP8, 192GB HBM, 7.2TB/s, 157W | GA 2025 · 42.5 exaflops clusters | Inference-first (Gemini, cloud customers) |
| Meta MTIA v2 family | MTIA 300/400/500 | TSMC 5nm (RISC-V) | 3–4x over v1 generation | In production · Meta datacenters | Recommendation/ranking (FB, IG, WhatsApp) |
| Microsoft Maia 200 | Maia 200 | TSMC 3nm | 216GB HBM3e, 10 PFLOPS FP4, 7TB/s | Live in Azure US Central + US West 3 | Inference (Copilot, Azure OpenAI, M365) |
| Google TPU v6 (Trillium) | Trillium | n/a | 4.7x perf over v5e, 256-chip pods | GA in Google Cloud | Training + Inference |

Sources: AWS, Google Cloud, Meta Engineering, Microsoft Azure announcements Q1–Q2 2026

Merchant ASIC Enablers

| Vendor | Role | Key Customers | Market Position |
|---|---|---|---|
| Broadcom (AVGO) | Designs custom XPUs + Ethernet switches | Google (TPU), Meta (MTIA), Microsoft (Maia), OpenAI, Anthropic, ByteDance | ~60–70% of custom AI ASIC market |
| Marvell (MRVL) | Custom ASIC design + networking silicon | Amazon (Trainium2; lost Trainium3 to Alchip), others | ~20–30% of custom AI ASIC market |

Deployment scale context (April 2026): AWS has 1.4M+ Trainium chips deployed across all generations. Google has committed to deploying up to 1M TPUs (~1 GW capacity) for Anthropic alone. Meta and Broadcom extended their MTIA partnership through 2029 for 1GW+ of 2nm custom silicon. Trainium3 has been nearly fully subscribed since first shipments in January 2026.
Internal-vs-Merchant Economics

The core thesis is simple: NVIDIA's H100 costs ~$3,320 to manufacture and sells for ~$28,000 — an 88% gross margin. Hyperscalers spending billions annually are paying for NVIDIA's software moat, not marginal silicon cost. Custom ASICs eliminate that premium for workloads where the moat doesn't matter.
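The margin figure above follows directly from the two per-unit numbers quoted. A minimal sketch of the arithmetic, assuming the ~$3,320 cost and ~$28,000 price are comparable per-unit figures (the function name is illustrative, and this is a hardware-only margin that ignores R&D and software costs):

```python
def gross_margin(unit_cost: float, unit_price: float) -> float:
    """Return gross margin as a fraction of the selling price."""
    if unit_price <= 0:
        raise ValueError("unit_price must be positive")
    return (unit_price - unit_cost) / unit_price

# Approximate per-unit figures quoted in the brief (USD).
H100_COST = 3_320
H100_PRICE = 28_000

margin = gross_margin(H100_COST, H100_PRICE)
markup = H100_PRICE / H100_COST  # selling price as a multiple of build cost

print(f"Gross margin: {margin:.1%}")      # 88.1%
print(f"Markup over cost: {markup:.1f}x")  # 8.4x
```

Put differently, a buyer pays roughly 8.4 times the manufacturing cost per chip, which is the premium custom silicon is meant to avoid.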

Training Cost Comparison

| Platform | Cloud Cost/hr | vs. H100 | Basis |
|---|---|---|---|
| NVIDIA H100 | $3.80–4.50/hr | Baseline | General-purpose training |
| NVIDIA B200 (Blackwell) | $5.19–8.60/hr | 3–4x throughput | FP4 native, 4x training speed |
| Google TPU v6e (Trillium) | ~$2.70/chip-hr | 2x better perf/$ | Committed at $0.39/chip-hr |
| AWS Trainium2 | ~$1.40–2.20/hr | ~30–40% cheaper | Per AWS benchmarks vs. P4 |
| AWS Trainium3 | ~¼ H100 cluster cost | 50% better price-perf | Per AWS shareholder letter |

Inference Cost Comparison

| Platform | Power | FP8 TFLOPs | Cost/token Est. | Notes |
|---|---|---|---|---|
| NVIDIA B200 | ~700W | ~4,500 | Low (FP4 native) | 15x H100 inference throughput at system level |
| Google TPU Ironwood | 157W | 4,614 | Very low at scale | 2x perf/watt vs. Trillium, 30x vs. TPU v2 |
| Microsoft Maia 200 | 750W TDP | >5,000 FP8 | Claims 30% better | 3x FP4 perf vs. Trainium3 (Microsoft claim) |

Sources: AWS, Google, Microsoft announced specs; SemiAnalysis; cloud pricing benchmarks
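The power and FP8 columns of the inference table can be combined into a rough efficiency ranking. The sketch below uses only the quoted figures and treats Maia 200's ">5,000" as a 5,000 lower bound, which is an assumption. The power numbers may not be measured on the same basis (chip power versus TDP), so the output is illustrative rather than a like-for-like benchmark.

```python
# (platform, power in watts, FP8 TFLOPs) as quoted in the inference table.
# Maia 200's ">5,000" is entered as 5,000, a lower-bound assumption.
CHIPS = [
    ("NVIDIA B200", 700, 4_500),
    ("Google TPU Ironwood", 157, 4_614),
    ("Microsoft Maia 200", 750, 5_000),
]

def tflops_per_watt(power_w: float, fp8_tflops: float) -> float:
    """Peak FP8 TFLOPs delivered per watt of quoted power."""
    return fp8_tflops / power_w

# Rank platforms from most to least efficient on the quoted figures.
ranked = sorted(
    ((name, tflops_per_watt(power, tflops)) for name, power, tflops in CHIPS),
    key=lambda row: row[1],
    reverse=True,
)

for name, efficiency in ranked:
    print(f"{name:<22} {efficiency:5.1f} FP8 TFLOPs/W")
```

On these inputs Ironwood comes out near 29 TFLOPs/W against roughly 6 to 7 for the other two, a gap large enough that the quoted 157W deserves checking against a primary source before it is relied on.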

The Hidden Cost — Advanced Packaging: Custom ASICs and NVIDIA GPUs compete for the same scarce CoWoS supply (TSMC). CoWoS running 3x short of demand through 2026. HBM3e remains undersupplied through CY2027. Critically, NVIDIA receives "VVP" (Very Very Preferred) DRAM pricing from Samsung/SK Hynix/Micron — a hidden structural cost advantage hyperscalers cannot replicate.

Bottom line: For fixed, high-volume, predictable inference workloads, custom ASICs deliver 30–50% better cost-per-token than merchant GPUs. For training frontier models, the economics favor NVIDIA because no custom ASIC has the software ecosystem to support general model development.
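The bottom line above depends on comparing cost per unit of work rather than cost per hour. As a minimal sketch of that normalization, the snippet below divides the quoted B200 hourly prices by the quoted 3–4x throughput multiple to get a cost per H100-equivalent hour. Pairing the lowest price with the highest multiple (and the reverse) is an assumption made to bracket the range, not something the pricing sources state.

```python
def cost_per_h100_equiv_hour(price_per_hr: float, throughput_vs_h100: float) -> float:
    """Hourly price divided by throughput relative to one H100."""
    return price_per_hr / throughput_vs_h100

# Quoted cloud prices (USD per hour) and throughput multiples.
H100_PRICE_RANGE = (3.80, 4.50)
B200_PRICE_RANGE = (5.19, 8.60)
B200_THROUGHPUT_RANGE = (3.0, 4.0)

# Best case: cheapest B200 hour at the highest throughput multiple.
b200_best = cost_per_h100_equiv_hour(B200_PRICE_RANGE[0], B200_THROUGHPUT_RANGE[1])
# Worst case: most expensive B200 hour at the lowest throughput multiple.
b200_worst = cost_per_h100_equiv_hour(B200_PRICE_RANGE[1], B200_THROUGHPUT_RANGE[0])

print(f"H100: ${H100_PRICE_RANGE[0]:.2f} to ${H100_PRICE_RANGE[1]:.2f} per hour")
print(f"B200: ${b200_best:.2f} to ${b200_worst:.2f} per H100-equivalent hour")
```

Under these assumptions even the worst-case B200 figure sits below the cheapest H100 hour, so a custom ASIC has to beat NVIDIA's newest part on price-performance, not just the H100 baseline.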
The TAM Split
AI Accelerator Hardware TAM, FY2027 est.: $160–280B (wide analyst range)

FY2026 Accelerator Spend Structure

| Category | FY2026 Est. | FY2027 Est. | CAGR |
|---|---|---|---|
| Total hyperscaler capex | $660–690B | ~$800B+ | ~20% |
| AI-specific (~75%) | ~$500B | ~$600B | n/a |
| AI accelerator hardware TAM | ~$43.75B | ~$160–280B | Wide range |
| NVIDIA share (~80%) | ~$35B of accel. TAM | Declining to ~75% | GPU CAGR: 16% |
| Custom ASIC share (~10–15%) | ~$4–7B | Growing to 20–25% | ASIC CAGR: 27–45% |

Sources: Goldman Sachs, Bloomberg Intelligence, TrendForce, Fortune Business Insights
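The share drift implied by the two growth rates can be checked with one year of compounding. The sketch below starts from the FY2026 dollar figures in the table (~$35B for NVIDIA, ~$4–7B for custom ASICs) and applies the quoted CAGRs. It is a deliberately simplified two-way split: it ignores other merchant suppliers and does not use the table's much wider FY2027 TAM range, so the result shows relative share only.

```python
def project(value: float, cagr: float, years: int = 1) -> float:
    """Compound a starting value forward at a constant annual growth rate."""
    return value * (1 + cagr) ** years

def asic_share(asic: float, gpu: float) -> float:
    """ASIC share of a simplified two-way GPU + ASIC market."""
    return asic / (asic + gpu)

# FY2026 starting points and growth rates quoted in the table ($B).
GPU_2026, GPU_CAGR = 35.0, 0.16
ASIC_2026_RANGE = (4.0, 7.0)
ASIC_CAGR_RANGE = (0.27, 0.45)

gpu_2027 = project(GPU_2026, GPU_CAGR)
asic_2027_low = project(ASIC_2026_RANGE[0], ASIC_CAGR_RANGE[0])
asic_2027_high = project(ASIC_2026_RANGE[1], ASIC_CAGR_RANGE[1])

share_low = asic_share(asic_2027_low, gpu_2027)
share_high = asic_share(asic_2027_high, gpu_2027)

print(f"FY2027 ASIC share of GPU + ASIC spend: {share_low:.0%} to {share_high:.0%}")
```

On these inputs the ASIC share lands between about 11% and 20% after one year, so the ~20% call sits at the top of what the table's own growth rates support.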

Capturable by ASICs (inference-dominated, fixed workloads):

Social media recommendation and ranking (Meta's entire MTIA use case)
Cloud inference for stable, production-deployed models (Bedrock, Copilot, Gemini serving)
Cost-sensitive fine-tuning on known architectures
Realistic share: ~15–20% of accelerator spend by 2027

Locked into NVIDIA (for now):

Frontier model training — every major model (GPT-5, Gemini 2.0, Claude 4, Llama 4) trained on NVIDIA hardware
Sovereign AI purchases (UAE: 500K+ GPUs/yr; Saudi Humain deal)
Enterprise customers without hyperscaler-scale engineering to build custom stacks
NVLink/Spectrum-X/InfiniBand networking stack — you can't replace the GPU without replacing the entire fabric
Realistic NVIDIA floor: ~75% of revenue through 2028
Key structural observation: Custom ASIC shipments will surpass GPU shipments in volume by 2027 — but GPUs maintain revenue dominance because NVIDIA charges 5–10x more per chip. ASICs win on unit economics for fixed workloads, GPUs win on flexibility and absolute performance for the frontier.
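The gap between unit share and revenue share follows from the price multiple alone. A minimal sketch, assuming every ASIC sells at one normalized price and every GPU at 5–10x that price (a simplification, since real pricing varies by chip and contract):

```python
def gpu_revenue_share(asic_units_per_gpu_unit: float, gpu_price_multiple: float) -> float:
    """GPU share of combined revenue, with ASIC price normalized to 1."""
    gpu_revenue = gpu_price_multiple          # one GPU unit at N times the ASIC price
    asic_revenue = asic_units_per_gpu_unit    # ASIC units at price 1 each
    return gpu_revenue / (gpu_revenue + asic_revenue)

# Unit parity (ASIC shipments just matching GPU shipments).
parity_5x = gpu_revenue_share(1.0, 5.0)
parity_10x = gpu_revenue_share(1.0, 10.0)

# ASICs shipping twice as many units as GPUs, at the low end of the price gap.
double_units_5x = gpu_revenue_share(2.0, 5.0)

print(f"Unit parity, 5x price:  {parity_5x:.1%}")
print(f"Unit parity, 10x price: {parity_10x:.1%}")
print(f"2x ASIC units, 5x price: {double_units_5x:.1%}")
```

At unit parity GPUs still take roughly 83–91% of revenue, and even with twice as many ASICs shipped the GPU revenue share stays above 70% at a 5x price gap.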
AVGO + MRVL ASIC Revenue Ramp

Broadcom (AVGO) — The Real Infrastructure Play

| Quarter | AI Semi Revenue | YoY Growth | Notable |
|---|---|---|---|
| Q1 FY2025 | $4.1B | +77% | Strong networking contribution |
| Q4 FY2025 | $6.5B | +74% | Backlog grew to $73B |
| Q1 FY2026 | $8.4B | +106% | Record; beat estimates |
| Q2 FY2026 (guide) | $10.7B | +140% | Growth accelerating |
| FY2027 target | $100B+ | n/a | CEO Hock Tan: "line of sight" |

The $73B backlog breakdown (as of March 2026): ~$53B in custom XPUs (ASICs), ~$20B in AI networking (Ethernet switches, Tomahawk 6). Delivery horizon: 18 months. Customer concentration: Google (locked through 2031), Meta (1GW 2nm deal through 2029), ByteDance, OpenAI, 5th hyperscale customer ($1B initial order H2 2026, reportedly Anthropic-adjacent).
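Two quick checks help put the backlog and growth figures in context. The sketch below spreads the $73B backlog evenly across the 18-month delivery horizon, which is an assumption (actual delivery is likely back-loaded), and backs out the prior-year quarter implied by each quoted growth rate.

```python
# Backlog components quoted in the brief ($B).
BACKLOG = {"custom_xpus": 53.0, "ai_networking": 20.0}
HORIZON_MONTHS = 18

total_backlog = sum(BACKLOG.values())
xpu_share = BACKLOG["custom_xpus"] / total_backlog

# Assumption: even delivery across the horizon, converted to a quarterly rate.
avg_quarterly_delivery = total_backlog / HORIZON_MONTHS * 3

def implied_prior_year(revenue: float, yoy_growth: float) -> float:
    """Back out the year-ago figure from current revenue and YoY growth."""
    return revenue / (1 + yoy_growth)

q1_fy25_implied = implied_prior_year(8.4, 1.06)   # Q1 FY2026: $8.4B, +106%
q2_fy25_implied = implied_prior_year(10.7, 1.40)  # Q2 FY2026 guide: $10.7B, +140%

print(f"Total backlog: ${total_backlog:.0f}B ({xpu_share:.0%} custom XPUs)")
print(f"Even-delivery run rate: ${avg_quarterly_delivery:.1f}B per quarter")
print(f"Implied Q1 FY2025 base: ${q1_fy25_implied:.1f}B")
print(f"Implied Q2 FY2025 base: ${q2_fy25_implied:.1f}B")
```

The even-delivery run rate works out to about $12.2B per quarter, a little above the $10.7B Q2 guide, and the implied Q1 FY2025 base of about $4.1B matches the reported figure.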

AI networking segment (the overlooked angle): Broadcom's Tomahawk 6, the only 102.4 Tbps switch on the market, generated ~$2.8B in Q1 (+60% YoY) and is guided to ~$4.3B in Q2. This networking monopoly is as durable as NVIDIA's CUDA lock. Separately, Mizuho estimates that the Anthropic deal alone is worth $21B in Broadcom revenue in 2026 and $42B in 2027.

Marvell (MRVL) — The Cautionary Tale

| Metric | Status |
|---|---|
| Trainium2 design win | Won |
| Trainium3 design win | Lost to Alchip |
| Primary AI customer | Amazon (AWS), heavy concentration |
| Google TPU win (reported) | One TPU generation reportedly won |
| ASIC market share | ~20–30% (vs. AVGO's 60–70%) |

MRVL's situation is a warning on customer concentration: Amazon shifting chip design to Alchip for Trainium3 shows hyperscalers will constantly re-bid ASIC design work.

Where NVIDIA Still Wins

The moats that aren't going away — and why the bear case is narrower than the headlines suggest.

Moat 1
Frontier Model Training — 100% Win Rate
Every major foundation model trained in 2025–2026 — GPT-5, Gemini 2.0, Claude 4, Llama 4 — ran on NVIDIA hardware. Zero exceptions. No custom ASIC has the software ecosystem to support general model development where architectures are still evolving daily.
Moat 2
CUDA — The Software Moat (5M+ developers, 15 years deep)
Switching from CUDA requires 6–12 months of engineering effort to port complex applications. AMD's ROCm 7.0 (expected late 2026) targets full PyTorch compatibility but enterprise adoption is nascent. The moat is so deep that EU/US antitrust regulators are investigating CUDA bundling as potentially anti-competitive behavior — that's the tell.
Moat 3
NVLink / Spectrum-X / InfiniBand — The Hidden Networking Moat
NVIDIA controls a three-tier interconnect stack: NVLink (scale-up within rack at 1.8TB/s bidirectional in Blackwell), Spectrum-X (scale-out between racks, AI-optimized), InfiniBand (ultra-low-latency training fabric). You don't replace NVIDIA GPUs; you replace the entire interconnect architecture. NVLink Fusion (announced 2025) is NVIDIA's strategic play to make this lock apply to third-party ASICs too.
Moat 4
Rubin Platform (2026) — Widening the Lead
Vera Rubin, built on TSMC 3nm with HBM4 memory: 10x inference efficiency improvement over Blackwell. NVIDIA's one-year architecture cadence (Hopper → Blackwell → Rubin → Rubin Ultra → Feynman) keeps competitors perpetually catching up.
Moat 5
Sovereign AI — The Political Moat
UAE: 500K+ NVIDIA GPUs per year through 2027. Saudi Humain deal: hundreds of thousands of chips over five years. These sovereign AI purchases are geopolitically motivated and explicitly NVIDIA-branded. No custom ASIC ecosystem can serve this. The sovereign channel is an entirely separate demand source.
Moat 6
Enterprise Inference Where CUDA Tooling Matters
Enterprises without hyperscale engineering teams can't build custom silicon stacks. NVIDIA NIM (microservices), TensorRT, and the managed inference ecosystem are a complete solution. Custom ASICs require massive in-house engineering. This keeps the long tail of enterprise AI on NVIDIA.
NVIDIA FY2026 financials: $215.9B revenue (+65% YoY), 74.5% gross margin.
3 Things to Watch in the Next 90 Days
Watch #1
Amazon Q1 2026 Earnings — Project Rainier & Trainium3 Capacity
Signal: Project Rainier commentary and Trainium3 capacity disclosures.

Amazon CEO Andy Jassy's shareholder letter already stated: "Trainium2 has largely sold out. Trainium3 just started shipping and is nearly fully subscribed." The earnings call will reveal whether external customers (beyond Anthropic) are adopting Trainium3, and whether AWS custom silicon is materially displacing NVIDIA GPU purchases in Bedrock workloads. A Trainium3 external GA announcement would be a meaningful bear-case catalyst for NVIDIA.

What to track: Mention of "Trainium" on call, AWS AI revenue attribution, and any new multi-GW capacity commitments.
Watch #2
Google TPU v7 (Ironwood) Customer Announcements Outside Google Cloud
Signal: The Anthropic deal (up to 1M TPUs) is confirmed. A deal for Meta to lease TPU capacity (advanced discussions reported, with a 2026 start) is not yet confirmed.

If Meta announces it's leasing Google TPUs while simultaneously deploying its own MTIA chips, it signals that even the hyperscalers with in-house silicon aren't betting everything on it — which is actually bullish for NVIDIA's longevity and bullish for Google's monetization. Watch for any Google Cloud Next announcements or Meta capex commentary about TPU capacity.
Falsification: If the Meta/Google TPU deal goes through, ASIC adoption is accelerating faster than expected.
Watch #3
Broadcom Q2 FY2026 Earnings (Early June 2026)
Signal: Will AI semiconductor revenue actually hit the guided $10.7B (+140% YoY)?

The $73B backlog gives high visibility, but margins are under pressure as AVGO shifts to more hardware-intensive system revenue. A gross margin miss (guided ~77%) alongside a revenue beat would confirm the "hardware-heavy" phase thesis: growth is real but profit quality is compressing. Conversely, a clean beat with stable margins would validate the $100B FY2027 trajectory.

Also watch: Whether AVGO discloses a 6th major customer (OpenAI is the open secret) and any update on Meta's 2nm 1GW program timeline.
Vektor's Call
The call
Custom silicon will capture ~20% of hyperscaler AI accelerator spend by end of 2027, up from ~10–15% in 2026.
The falsifiable claim
If Broadcom misses its $100B AI semiconductor revenue target for FY2027 by more than 20% (i.e., comes in below $80B), the custom ASIC substitution thesis has materially stalled. Score this on Broadcom's Q4 FY2027 earnings report.
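The scoring rule above can be written down so it is unambiguous when the number arrives. A minimal sketch, assuming the reported figure is FY2027 AI semiconductor revenue in billions of dollars (the function name and defaults are illustrative):

```python
def thesis_stalled(actual_b: float, target_b: float = 100.0, max_miss: float = 0.20) -> bool:
    """True when revenue misses the target by more than the allowed fraction."""
    floor_b = target_b * (1 - max_miss)  # $80B with the defaults
    return actual_b < floor_b

# Hypothetical outcomes, in $B, to show how the rule scores.
for revenue in (72.0, 79.9, 80.0, 95.0, 104.0):
    verdict = "stalled" if thesis_stalled(revenue) else "intact"
    print(f"FY2027 AI semi revenue ${revenue:.1f}B -> thesis {verdict}")
```

The comparison is strict, so a result of exactly $80B counts as a 20% miss and leaves the thesis intact, matching the "more than 20%" wording.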

The bear case for NVIDIA is real but narrower than the headlines suggest. The custom silicon threat is concentrated in inference on fixed workloads at hyperscale: not training, not enterprise, not sovereign AI. At NVIDIA's current trajectory ($215B revenue, $194B data center, 80% accelerator share), even a 10-point share loss to ASICs over three years represents tens of billions of dollars a year in addressable market that NVIDIA will cede to Broadcom and the hyperscalers themselves (10 points of the $160–280B FY2027 accelerator TAM is $16–28B). That's a slow grind, not a cliff.

The actual NVIDIA risk isn't custom silicon directly — it's NVIDIA's own networking play (NVLink Fusion) possibly overreaching into ASIC territory and triggering hyperscaler backlash, or AMD's ROCm finally reaching CUDA parity and allowing the existing $194B revenue base to defect on inference.

Hold both truths: The custom silicon map is real, scaling rapidly, and measurably eroding NVIDIA's total addressable share. NVIDIA is also the most defensible $215B revenue business in enterprise technology history. These are not contradictions.
Research conducted April 29, 2026. Key sources: Goldman Sachs hyperscaler capex estimates, Bloomberg Intelligence AI Accelerator Chips 2026 Outlook, SemiAnalysis Accelerator Industry Model (public disclosures), Broadcom Q1 FY2026 earnings (March 4, 2026), NVIDIA FY2026 annual results, TechCrunch exclusive Amazon Trainium lab tour (March 2026), Google Cloud Next '25 Ironwood announcement, Microsoft Maia 200 launch (January 2026), Meta MTIA v2 announcement (February 2026), Anthropic/Amazon partnership expansion (2026), TrendForce custom ASIC shipment projections.

All figures in USD unless noted. Market share estimates represent revenue share, not unit volume, unless specified. Forward-looking statements sourced to named companies or research firms. This brief is intelligence analysis, not investment advice.