Nvidia doesn’t do small announcements. At CES 2026, Jensen Huang unveiled the Rubin platform - named after astronomer Vera Rubin - and it’s the most aggressive chip roadmap the company has ever put forward. Six new chips. A rack-scale AI supercomputer. And performance numbers that make Blackwell look like a warm-up.
With Nvidia reporting earnings on February 25 and the entire AI trade hanging in the balance, understanding Rubin isn’t optional anymore. It’s where the next $500 billion in AI infrastructure is heading.
What Is Rubin?
Rubin isn’t just a GPU. It’s a complete platform - CPU, GPU, networking, storage, and security - all co-designed from the ground up. Think of Blackwell as Nvidia owning the engine. Rubin is Nvidia owning the entire car.
Here are the six chips:
| Chip | What It Does |
|---|---|
| Rubin GPU | 50 petaflops of FP4 inference, 35 petaflops training. 288GB HBM4 memory. |
| Vera CPU | 88 custom Olympus cores, Armv9.2. Built for agentic AI workloads. |
| NVLink 6 Switch | 3.6 TB/s bandwidth per GPU. The glue holding the rack together. |
| ConnectX-9 SuperNIC | Integrated networking for rack-scale AI. |
| BlueField-4 DPU | Storage processor with built-in SSD for key-value cache. |
| Spectrum-6 Switch | 200G SerDes with co-packaged optics for AI-optimized fabrics. |
The flagship configuration is the Vera Rubin NVL72: a single rack containing 72 Rubin GPUs, 36 Vera CPUs, and all the networking and storage baked in. Total interconnect bandwidth: 260 TB/s.
For context, the entire internet backbone carries roughly 1,000 TB/s by common estimates. A single Nvidia rack now moves about a quarter of that internally.
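A quick sanity check on that 260 TB/s figure, assuming it’s simply the per-GPU NVLink 6 bandwidth summed across the rack (my assumption - Nvidia hasn’t published the exact breakdown):

```python
# Back-of-envelope check of the NVL72's quoted interconnect bandwidth.
# Assumption: the 260 TB/s figure is per-GPU NVLink 6 bandwidth summed
# across all 72 GPUs - Nvidia hasn't published the exact breakdown.
gpus_per_rack = 72
nvlink_bw_per_gpu_tbs = 3.6  # TB/s, from the NVLink 6 Switch spec above

aggregate_tbs = gpus_per_rack * nvlink_bw_per_gpu_tbs
print(f"Aggregate NVLink bandwidth: {aggregate_tbs:.1f} TB/s")  # 259.2 ~ 260

backbone_tbs = 1_000  # the rough internet-backbone figure cited above
print(f"Share of backbone estimate: {aggregate_tbs / backbone_tbs:.0%}")  # 26%
```

The numbers line up: 72 × 3.6 TB/s is 259.2 TB/s, which rounds to the quoted 260.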
Rubin vs Blackwell: The Numbers
This is where it gets wild.
| Metric | Blackwell | Rubin | Improvement |
|---|---|---|---|
| FP4 inference | 10 PFLOPS | 50 PFLOPS | 5x |
| FP4 training | 10 PFLOPS | 35 PFLOPS | 3.5x |
| HBM bandwidth | 8 TB/s | 22 TB/s | 2.8x |
| NVLink bandwidth/GPU | 1.8 TB/s | 3.6 TB/s | 2x |
| Inference cost per token | Baseline | 10x lower | 10x |
| GPUs to train MoE models | Baseline | 4x fewer | 4x |
| Transistor count | Baseline | 1.6x more | 1.6x |
The headline number: 10x reduction in inference token cost. If you’re running a large language model in production today, Rubin means your inference bill drops by 90% at the same throughput. Or you get 10x the throughput at the same cost.
For training, the 4x GPU reduction for Mixture-of-Experts models is massive. A model that required 4,096 Blackwell GPUs could train on 1,024 Rubin GPUs. That’s not just cheaper - it’s fundamentally easier to coordinate.
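To make those multipliers concrete, here’s a minimal sketch of the economics. The dollar figures and workload sizes below are hypothetical and purely illustrative - only the 10x and 4x ratios come from the announcement:

```python
# Illustrative economics of the claimed 10x inference and 4x training gains.
# Every baseline number here is hypothetical, chosen only to show the scaling.

# Inference: 10x lower cost per token at the same throughput.
blackwell_cost_per_m_tokens = 1.00       # hypothetical $ per million tokens
rubin_cost_per_m_tokens = blackwell_cost_per_m_tokens / 10
monthly_tokens_m = 500_000               # hypothetical: 500B tokens per month
print(f"Blackwell bill: ${blackwell_cost_per_m_tokens * monthly_tokens_m:,.0f}/mo")
print(f"Rubin bill:     ${rubin_cost_per_m_tokens * monthly_tokens_m:,.0f}/mo")

# Training: 4x fewer GPUs for the same Mixture-of-Experts job.
blackwell_gpus = 4_096
rubin_gpus = blackwell_gpus // 4         # 1,024 - the example from the text
print(f"MoE cluster: {blackwell_gpus:,} Blackwell -> {rubin_gpus:,} Rubin GPUs")
```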
Why This Architecture Matters
Previous Nvidia generations were primarily about a better GPU. Rubin is different - it’s about system-level co-design.
The Vera CPU
For the first time, Nvidia has a serious server CPU. The Vera has 88 custom “Olympus” cores (Armv9.2), optimized specifically for agentic AI workloads - things like multi-step reasoning, tool use, and chain-of-thought inference where the CPU orchestrates while the GPU computes.
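To make “the CPU orchestrates while the GPU computes” concrete, here’s a deliberately simplified sketch of an agentic loop. The function names are hypothetical stand-ins, not any Nvidia or Vera API:

```python
# Simplified shape of an agentic workload: the CPU runs the control loop
# (parsing, tool dispatch, branching) while the GPU handles each model call.
# gpu_generate() and run_tool() are hypothetical stand-ins, not real APIs.

def gpu_generate(prompt: str) -> str:
    """Stand-in for a batched LLM inference call executing on the GPU."""
    return "final answer"  # placeholder response

def run_tool(call: str) -> str:
    """Stand-in for a CPU-side tool: web search, code execution, DB query."""
    return f"[result of {call.strip()}]"

def agent_loop(task: str, max_steps: int = 8) -> str:
    context = task
    for _ in range(max_steps):            # CPU: orchestration and branching
        reply = gpu_generate(context)     # GPU: the heavy tensor math
        if reply.startswith("TOOL:"):     # CPU: parse, decide, dispatch
            context += "\n" + run_tool(reply[len("TOOL:"):])
        else:
            return reply                  # model produced a final answer
    return context

print(agent_loop("Summarize the Rubin announcement"))
```

The point is the shape of the workload: lots of branchy, latency-sensitive CPU work wrapped around every GPU call - exactly what a big general-purpose core count is for.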
Meta is the first customer to deploy Vera CPUs as standalone processors in its data centers, separate from any GPUs. That’s a direct shot at Intel and AMD’s server CPU business.
The BlueField-4 DPU
Here’s a detail most coverage missed: the BlueField-4 has an integrated SSD for key-value cache. In LLM inference, the KV cache is often the bottleneck - it stores the attention keys and values for every token in the sequence, and it grows with both context length and batch size. By giving that cache a home on the DPU, Nvidia offloads a major memory bandwidth chokepoint from the GPU’s HBM.
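To see why, here’s the standard back-of-envelope KV cache sizing. The model dimensions are hypothetical - roughly a 70B-class model with grouped-query attention - not a spec for any Rubin-era model:

```python
# Back-of-envelope KV cache sizing for transformer inference.
# Model dimensions are hypothetical (roughly a 70B-class model with
# grouped-query attention); illustrative only, not any Rubin-era spec.

layers        = 80        # transformer layers
kv_heads      = 8         # key/value heads (grouped-query attention)
head_dim      = 128       # dimension per head
seq_len       = 128_000   # context length in tokens
bytes_per_val = 2         # fp16/bf16 cache entries

# K and V each store layers * kv_heads * head_dim values per token.
bytes_per_seq = 2 * layers * kv_heads * head_dim * seq_len * bytes_per_val
print(f"KV cache per sequence: {bytes_per_seq / 2**30:.1f} GiB")  # ~39 GiB

batch = 8
print(f"Batch of {batch}: {batch * bytes_per_seq / 2**30:.0f} GiB")  # ~312 GiB
```

At those sizes, a single long-context batch overflows even Rubin’s 288GB of HBM4 - exactly the situation an on-DPU SSD tier is built for.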
Cable-Free Rack Design
The NVL72 rack uses a modular, cable-free tray design that enables 18x faster assembly and servicing than Blackwell racks. When you’re deploying tens of thousands of racks across global data centers, this is an operational game-changer. Fewer cables = fewer failure points = higher uptime.
The $500 Billion Pipeline
During last quarter’s earnings, Nvidia disclosed $500 billion in combined Blackwell and Rubin visibility through the end of calendar 2026, with roughly $150 billion already shipped. The customers buying this hardware aren’t experimenting - they’re building permanent AI infrastructure.
The buyer list reads like a tech industry roster:
- Meta signed a multi-year deal for millions of Nvidia chips, including Vera CPUs and Rubin systems
- Microsoft, Google, and Amazon are all building Nvidia-powered AI data centers at unprecedented scale
- Oracle, CoreWeave, and Lambda are expanding GPU cloud capacity
The Mag 7 alone plan to spend roughly $700 billion on AI capex in 2026. A significant chunk of that flows directly through Nvidia.
The Roadmap: Rubin → Rubin Ultra → Feynman
Nvidia’s cadence is now annual, and the roadmap is public:
| Platform | Release | Performance |
|---|---|---|
| Blackwell | Shipping now | Baseline |
| Rubin | H2 2026 | 5x inference, 3.5x training |
| Rubin Ultra | H2 2027 | 100 PFLOPS FP4, up to 1TB HBM4e |
| Feynman | 2028 | 5-20x over Rubin (estimated) |
Rubin Ultra doubles Rubin’s compute to 100 petaflops and jumps to 1 TB of HBM4e memory per chip. For reference, that’s more memory than most servers have in total RAM today - on a single GPU.
Feynman, named after physicist Richard Feynman, is the 2028 architecture. Jensen has been teasing it as something that will “surprise the world.” Details are scarce, but it’s expected to push into HBM5 territory and target trillion-parameter models natively.
This annual cadence is the real moat. AMD and Intel have historically shipped new data center silicon on 2-3 year cycles. Nvidia is iterating every 12 months, making it nearly impossible for competitors to close the gap.
What This Means for the Industry
For AI companies
Inference costs dropping 10x changes the economics of every AI product. Features that were too expensive to ship at scale - real-time video generation, persistent AI agents, multi-modal reasoning - become viable. Expect a wave of AI products in late 2026 / early 2027 that only exist because Rubin made them affordable.
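One way to see the shift: a persistent agent that consumes tokens all day. The prices and usage numbers below are mine, not anyone’s published figures - only the 10x ratio comes from Nvidia:

```python
# Why 10x cheaper inference changes what ships: a persistent-agent example.
# Prices and usage numbers are hypothetical, for illustration only.

price_per_m_tokens = 2.00            # hypothetical $/1M tokens today
tokens_per_user_per_day = 2_000_000  # an always-on agent, hypothetical usage

monthly_cost = price_per_m_tokens * (tokens_per_user_per_day / 1e6) * 30
print(f"Per-user cost today:    ${monthly_cost:.2f}/month")      # $120 - hard to sell

rubin_monthly = monthly_cost / 10
print(f"Per-user cost on Rubin: ${rubin_monthly:.2f}/month")     # $12 - a viable tier
```

A feature that costs $120 per user per month is a demo; at $12 it’s a product.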
For Nvidia’s competitors
AMD’s MI300X offers 192GB of HBM3 memory at competitive pricing, and Intel is targeting the budget segment with Gaudi chips. But Rubin’s full-stack approach - CPU, GPU, DPU, networking, all co-designed - is incredibly difficult to replicate. Having a competitive GPU isn’t enough when Nvidia controls the entire rack.
For investors
Nvidia reports Q4 FY2026 earnings on February 25. The Street expects $65.7B revenue and $1.53 EPS. But the real signal is forward guidance. If Nvidia guides above the expected $71B for next quarter, it confirms Rubin pre-orders are accelerating. If below, the entire AI infrastructure thesis gets questioned.
With 39 analysts at “Strong Buy” and price targets ranging from $100 to $352, Nvidia is the most consequential earnings report of the quarter. Maybe the year.
The Bottom Line
Nvidia isn’t just selling GPUs anymore. With Rubin, they’re selling complete AI data center infrastructure - compute, memory, networking, storage, and security - in a single rack. The 10x cost reduction in inference and the annual upgrade cadence create a flywheel that competitors can’t easily break.
The question isn’t whether Rubin is impressive. It is. The question is whether the $700 billion bet on AI infrastructure will generate returns. If it does, Nvidia becomes the most important company of the decade. If it doesn’t, well - that’s a $700 billion problem for everyone, not just Nvidia.
Either way, the Rubin platform is Nvidia’s most ambitious launch since the original CUDA ecosystem. And Jensen Huang is already teasing what comes next.