Rubin CPX: Nvidia's GPU Built to Generate Video and Software at Exascale

Nvidia’s next leap in AI hardware

Nvidia announced Rubin CPX, a purpose-built AI chip designed to handle exceptionally large context workloads such as full-length video creation and whole-project software generation. Targeted for a late 2026 rollout, Rubin CPX integrates video decode/encode and AI inference into one tightly coupled system to remove I/O bottlenecks and accelerate end-to-end media and code pipelines.

Architecture and raw performance

Rubin CPX is part of the Vera Rubin NVL144 rack-scale platform and delivers extreme density: Nvidia cites 8 exaflops of AI compute, 100 TB of memory, and very high bandwidth inside the rack. The company claims roughly a 7.5× performance uplift over prior Blackwell-based systems. Crucially, Rubin CPX is optimized for long-context inference, handling context windows above 1 million tokens, which is essential for tasks like video generation that must reason across long temporal sequences and for code generation that must understand entire repositories.
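To see why rack-scale memory matters at million-token context lengths, consider a back-of-the-envelope KV-cache estimate. The model shape below (80 layers, 8 grouped-query KV heads, head dimension 128, fp16 storage) is a hypothetical illustration of a large frontier model, not a Rubin CPX specification:

```python
def kv_cache_bytes(tokens, layers, kv_heads, head_dim, bytes_per_elem=2):
    # 2x for the key and value tensors, each effectively of shape
    # [layers, tokens, kv_heads * head_dim], stored at 2 bytes (fp16).
    return 2 * layers * tokens * kv_heads * head_dim * bytes_per_elem

# Hypothetical large model: 80 layers, 8 KV heads of dim 128 (GQA), fp16.
size = kv_cache_bytes(tokens=1_000_000, layers=80, kv_heads=8, head_dim=128)
print(f"{size / 1e9:.0f} GB per 1M-token sequence")  # -> 328 GB
```

A single million-token sequence under these assumptions needs on the order of hundreds of gigabytes of cache, which is why a 100 TB rack, rather than a single GPU's memory, is the relevant unit for this class of workload.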

Economics and monetization potential

Nvidia projects that a $100 million deployment of Rubin CPX infrastructure could enable up to $5 billion in token-driven revenue, suggesting new commercial models for AI-as-a-service built around tokenized consumption. For platform operators and API providers, Rubin CPX promises to make large-context, latency-sensitive services economically viable at scale.
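The 50x revenue-to-capex projection can be sanity-checked with simple arithmetic. The per-token price and deployment lifetime below are illustrative assumptions, not figures from Nvidia's announcement:

```python
capex = 100e6              # $100M deployment (Nvidia's projection)
revenue_target = 5e9       # $5B token revenue (Nvidia's projection)
price_per_m_tokens = 10.0  # assumed: $10 per million tokens served
years = 4                  # assumed deployment lifetime

tokens_needed = revenue_target / price_per_m_tokens * 1e6
seconds = years * 365 * 24 * 3600
sustained_rate = tokens_needed / seconds

print(f"return multiple: {revenue_target / capex:.0f}x")           # -> 50x
print(f"tokens to serve: {tokens_needed:.1e}")                     # -> 5.0e+14
print(f"sustained throughput: {sustained_rate / 1e6:.1f}M tok/s")  # -> 4.0M tok/s
```

Under these assumptions, hitting the projection means serving roughly half a quadrillion tokens, a sustained multi-million-tokens-per-second pace, which is the kind of throughput the rack-scale design is pitched at.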

Practical use cases

For media and entertainment, Rubin CPX could enable cleaner autonomous editing, near-instant highlight reels, real-time video transformations, and other generative long-form content that was previously limited by context size and throughput. For developers and enterprises, the larger context window means AI assistants could generate or refactor code spanning entire projects rather than short snippets, improving their utility for real engineering workflows.

Industry context and supply chain

Nvidia says the Rubin GPU and Vera CPU are taped out and in fabrication at TSMC, signaling momentum toward the 2026 timeline. The company also noted that demand is strong while maintaining that existing lines like the H100 and H200 are not sold out. Meanwhile, the broader high-performance computing landscape is active: Germany has just activated the Jupiter exascale supercomputer, powered by Nvidia technologies, underscoring global investment in big-AI infrastructure.

What to watch next

Key milestones to track include Rubin CPX silicon tape-out confirmations, system-level performance benchmarks on real video and code workloads, partner integrations for media platforms and developer tools, and how token-based pricing models evolve as operators begin to monetize long-context AI capabilities.