Nvidia has officially unveiled the **Rubin CPX GPU**, a groundbreaking graphics processing unit specifically designed for **extensive long-context AI inference tasks** that handle context windows exceeding one million tokens. This new GPU is aimed at accelerating demanding AI applications such as software coding and generative video, delivering unprecedented speed and efficiency in processing massive context sequences[1][2][3].
The Rubin CPX forms a key component of Nvidia’s integrated **Vera Rubin NVL144 CPX platform**, which combines 144 Rubin CPX GPUs, 144 Rubin GPUs, and 36 Vera CPUs in a single rack. This platform delivers a staggering **8 exaflops of AI compute performance**, along with **100TB of high-speed memory** and **1.7 petabytes per second of memory bandwidth**, resulting in a system that offers over 7.5 times the AI performance of previous-generation Nvidia systems like the GB300 NVL72[1][3].
Key technical highlights of the Rubin CPX include:
- **128 GB of GDDR7 memory** for fast data access[3][4].
- Hardware support for video decoding and encoding, enabling efficient generative video workflows[3].
- 30 petaFLOPs of NVFP4 compute power.
- Specialized hardware delivering 3x acceleration in attention mechanisms compared to Nvidia’s earlier GPUs, optimizing long-context sequence processing[3].
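As a rough sanity check, the per-GPU figures above can be aggregated to the rack level. This is a back-of-the-envelope sketch only; the 8-exaflop and 100TB platform totals also count the 144 Rubin GPUs and 36 Vera CPUs, so the CPX-only sums below are lower bounds:

```python
# Back-of-the-envelope aggregation of per-GPU Rubin CPX specs to rack scale.
# Platform totals quoted by Nvidia include the 144 Rubin GPUs and 36 Vera
# CPUs as well, so the CPX-only sums here are lower bounds.

CPX_GPUS_PER_RACK = 144
NVFP4_PFLOPS_PER_CPX = 30   # petaFLOPs of NVFP4 compute per Rubin CPX
GDDR7_GB_PER_CPX = 128      # GB of GDDR7 memory per Rubin CPX

cpx_compute_exaflops = CPX_GPUS_PER_RACK * NVFP4_PFLOPS_PER_CPX / 1000
cpx_memory_tb = CPX_GPUS_PER_RACK * GDDR7_GB_PER_CPX / 1000

print(f"CPX-only compute: {cpx_compute_exaflops:.2f} EF (platform total: 8 EF)")
print(f"CPX-only GDDR7:   {cpx_memory_tb:.1f} TB (platform total: 100 TB)")
```

The CPX parts alone account for roughly 4.32 exaflops and 18.4 TB; the remainder of the quoted platform totals comes from the other components in the rack.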
Nvidia’s Rubin CPX is built around a new architecture tailored for the compute-intensive "context phase" of AI inference, which is critical when handling very large context windows. It complements existing Rubin GPUs and Vera CPUs by focusing on maximizing throughput and efficiency during this phase. This design supports Nvidia’s broader vision of **disaggregated inference infrastructure**, enabling scalable, high-performance AI deployments that can handle complex, long-sequence applications such as advanced software development tools and high-definition video generation[2][3].
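The split between a compute-bound context (prefill) phase and a bandwidth-bound generation (decode) phase can be sketched as a toy model. This is illustrative only; all class and method names are hypothetical and do not correspond to any Nvidia API:

```python
# Toy sketch of disaggregated inference (hypothetical names, not an Nvidia
# API): the compute-heavy "context" (prefill) phase and the bandwidth-heavy
# "generation" (decode) phase run on separate workers, handing off a KV cache.

from dataclasses import dataclass

@dataclass
class KVCache:
    num_tokens: int  # size of the cached context

class ContextWorker:
    """Stands in for a context-optimized accelerator (a CPX-class part)."""
    def prefill(self, prompt_tokens: int) -> KVCache:
        # Process the entire prompt in one compute-intensive pass.
        return KVCache(num_tokens=prompt_tokens)

class GenerationWorker:
    """Stands in for a bandwidth-optimized accelerator."""
    def decode(self, cache: KVCache, max_new_tokens: int) -> int:
        # Autoregressive decode: one token per step, growing the cache.
        produced = 0
        for _ in range(max_new_tokens):
            cache.num_tokens += 1
            produced += 1
        return produced

def serve(prompt_tokens: int, max_new_tokens: int) -> int:
    cache = ContextWorker().prefill(prompt_tokens)            # context phase
    return GenerationWorker().decode(cache, max_new_tokens)   # generation phase

print(serve(1_000_000, 256))  # → 256
```

The design point is that the two phases stress different hardware resources, so a million-token prompt can be prefetched on context-optimized silicon while token generation runs elsewhere.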
Industry innovators including Cursor, Runway, and Magic are already exploring the Rubin CPX’s capabilities to accelerate their AI applications, highlighting its potential to transform AI workloads that require massive context[1].
The Rubin CPX GPU is expected to be commercially available by the **end of 2026**, giving enterprises new opportunities to monetize large-scale AI inference with improved performance and efficiency[2][4].
This launch underscores Nvidia’s continued leadership in AI hardware, following record data center sales and ongoing investment in AI infrastructure technologies that push the boundaries of what AI systems can achieve in terms of scale and complexity[2].
🔄 Updated: 9/9/2025, 5:00:21 PM
Nvidia's Rubin CPX GPU, designed for extensive long-context AI inference tasks, has drawn regulatory attention amid ongoing US government controls on AI chip exports. Nvidia recently announced that the US government will take a 15% cut of its chip sales to China, reflecting increased oversight of sensitive semiconductor technologies. It remains to be seen how these measures will affect Rubin CPX's commercial availability, especially given prior export restrictions and Chinese authorities' investigations into security risks[1][4]. The GPU is expected to be available by the end of 2026, pending such regulatory approvals[2].
🔄 Updated: 9/9/2025, 5:10:24 PM
Following NVIDIA’s announcement of the Rubin CPX GPU designed for extensive long-context AI inference, the market reacted positively, driving NVIDIA's stock up by approximately 3.7% on September 9, 2025. Investors highlighted the platform's 8 exaflops of rack-scale performance and its estimated revenue impact, citing analysts' projections of $5 billion in token revenue per $100 million invested as a strong growth catalyst. A spokesperson noted that the Rubin CPX "marks a significant leap in AI infrastructure, enabling unprecedented efficiency for million-token applications," which helped fuel renewed investor confidence[1][2][3].
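The projection cited above works out to a simple ratio (illustrative arithmetic only; the underlying assumptions are the analysts', not verified figures):

```python
# Analysts' projection from the announcement, expressed as a revenue multiple.
capex = 100e6          # $100 million of infrastructure investment
token_revenue = 5e9    # projected $5 billion in token revenue

multiple = token_revenue / capex
print(f"Projected revenue multiple: {multiple:.0f}x")  # → 50x
```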
🔄 Updated: 9/9/2025, 5:20:24 PM
NVIDIA today unveiled the Rubin CPX GPU, specifically engineered for massive long-context AI inference tasks exceeding one million tokens, targeting applications like software coding and generative video[1][2]. The Rubin CPX features 128GB of GDDR7 memory, delivers 30 petaFLOPs of NVFP4 compute, and integrates into the Vera Rubin NVL144 CPX platform, which offers 8 exaflops of AI performance and 100TB of fast memory per rack, 7.5 times more powerful than prior GB300 systems[1][3][4]. NVIDIA highlighted that AI innovators such as Cursor, Runway, and Magic are already exploring Rubin CPX to accelerate their workloads, with the GPU expected to launch by the end of 2026[1][2].
🔄 Updated: 9/9/2025, 5:30:38 PM
NVIDIA’s unveiling of the Rubin CPX GPU marks a significant shift in the competitive AI inference landscape: the Vera Rubin NVL144 CPX platform built around it delivers 8 exaflops of AI performance and 100TB of fast memory per rack, 7.5 times the performance of previous GB300 NVL72 systems[1]. Slated for release by late 2026, this new class of GPU targets massive long-context tasks like million-token coding and generative video, positioning NVIDIA ahead in enabling unprecedented scale and efficiency for AI innovators such as Cursor and Runway[1][3]. This advancement intensifies competition by setting a new benchmark for large-scale inference capabilities, compelling rivals to accelerate development of similarly high-memory, high-throughput AI hardware.
🔄 Updated: 9/9/2025, 5:40:42 PM
The U.S. government will take a **15% cut of Nvidia chip sales to China**, reflecting regulatory involvement following the Rubin CPX GPU reveal[1]. This comes amid Nvidia’s unveiling of the Rubin CPX, designed for massive long-context AI inference tasks, though no other direct government responses or regulatory actions regarding the GPU itself have been reported.
🔄 Updated: 9/9/2025, 5:51:00 PM
NVIDIA has unveiled the Rubin CPX GPU, a specialized accelerator delivering up to **30 petaflops of NVFP4 compute power** and equipped with **128GB of GDDR7 memory**, designed for massive long-context AI inference tasks such as million-token coding and generative video[1][2]. Integrated within the Vera Rubin NVL144 CPX platform, the system achieves **8 exaflops of aggregate AI compute**, **100TB of fast memory**, and **1.7 petabytes per second of memory bandwidth**, offering a **7.5× performance improvement** over previous NVIDIA GB300 NVL72 architectures[1][3][4]. Rubin CPX features a **3x acceleration in attention mechanisms**, enabling AI models to process long-context sequences far more efficiently[3].
🔄 Updated: 9/9/2025, 6:01:07 PM
The U.S. government has announced it will take a **15% cut of Nvidia's chip sales to China**, signaling regulatory financial participation as the Rubin CPX rolls out for massive AI applications[1]. No further specific regulatory restrictions or policy statements regarding the Rubin CPX’s advanced AI capabilities have been reported yet.
🔄 Updated: 9/9/2025, 6:11:10 PM
Nvidia has unveiled the Rubin CPX GPU, engineered specifically for extensive long-context AI inference tasks, capable of efficiently handling workloads exceeding 1 million tokens. Scheduled for release by the end of 2026, Rubin CPX is designed to significantly boost performance and efficiency in large-scale inferencing, including applications in video generation and code synthesis[2][3][1]. This GPU represents a new class of inference accelerator, integrated into the Vera Rubin NVL144 CPX platform to empower advanced AI deployments requiring vast contextual understanding[1][3].
🔄 Updated: 9/9/2025, 6:21:06 PM
Nvidia’s announcement of the Rubin CPX GPU, optimized for AI inference with context windows exceeding 1 million tokens, marks a significant shift in the competitive landscape by setting a new standard for long-context AI tasks such as video generation and software development[1][3]. Slated for release at the end of 2026, the Rubin CPX, as part of the Vera Rubin NVL144 CPX platform, aims to surpass existing solutions in large-scale inference performance, reinforcing Nvidia’s dominance amid its recent $41.1 billion data center revenue surge[1][2]. This move intensifies competition, forcing rivals to innovate around large-context and disaggregated inference architectures.
🔄 Updated: 9/9/2025, 6:31:09 PM
Nvidia's newly unveiled Rubin CPX GPU, embedded in the Vera Rubin NVL144 CPX platform, is designed specifically for extensive long-context AI inference workloads and is expected to launch by the end of 2026[1]. Industry experts highlight this as a "new class of GPU" that promises significant performance boosts in handling large-scale AI tasks, addressing a critical need for AI models requiring long-range context processing[1]. Analysts emphasize that Rubin CPX’s architecture could set a new standard in AI inference efficiency, though detailed benchmarks and specs are awaited closer to release[1].
🔄 Updated: 9/9/2025, 6:41:14 PM
NVIDIA has unveiled the Rubin CPX GPU, delivering up to **30 petaflops of NVFP4 precision compute** with **128GB of GDDR7 memory**, specifically optimized for extensive long-context AI inference tasks such as million-token coding and generative video[1][2][3]. Integrated into the Vera Rubin NVL144 CPX platform, it enables **8 exaflops of AI compute**, **100TB of fast memory**, and **1.7 petabytes/second memory bandwidth** in a single rack, offering **3x faster attention performance** than prior NVIDIA GB300 systems, thereby significantly boosting throughput and efficiency in large-scale, context-heavy generative AI workloads[2][3][5]. This architecture supports NVIDIA's broader push toward disaggregated inference infrastructure for long-sequence applications[2][3].
🔄 Updated: 9/9/2025, 6:51:18 PM
NVIDIA has unveiled the Rubin CPX GPU, a breakthrough design delivering **30 petaflops of NVFP4-precision compute** and equipped with **128GB of GDDR7 memory**, optimized for extensive long-context AI inference tasks such as million-token software coding and generative video[1][2][3]. Integrated within the Vera Rubin NVL144 CPX platform, 144 Rubin CPX GPUs combined with 144 Rubin GPUs and 36 Vera CPUs provide up to **8 exaflops of AI compute**, **100TB of fast memory**, and **1.7 petabytes per second of memory bandwidth**, achieving 3x faster attention processing compared to previous NVIDIA GB300 NVL72 systems, thus enabling significant acceleration of long-context workloads[1][3].
🔄 Updated: 9/9/2025, 7:01:20 PM
Following Nvidia's unveiling of the Rubin CPX GPU, whose Vera Rubin NVL144 CPX platform delivers 8 exaflops of compute for extensive long-context AI inference, the U.S. government announced it will take a 15% cut of chip sales to China, signaling regulatory measures tied to advanced AI hardware transactions[1]. Nvidia CEO Jensen Huang emphasized the Rubin CPX's role in "the next-generation AI computing frontier," which may invite further government oversight given its AI processing capabilities and significant economic impact, estimated at $5 billion in token revenue per $100 million invested[1][2].
🔄 Updated: 9/9/2025, 7:11:16 PM
The US government has announced it will take a **15% cut of Nvidia's chip sales to China**, signaling regulatory scrutiny amid the Rubin CPX's deployment for large-scale AI inference tasks[1]. This move reflects increased government involvement in semiconductor trade and AI technology distribution, as the Vera Rubin NVL144 CPX platform's **8 exaflops** of compute and massive memory capacity raise potential national security and export control concerns[2][4].
🔄 Updated: 9/9/2025, 7:21:16 PM
NVIDIA has unveiled the Rubin CPX GPU, a monolithic die design built on the Rubin architecture with 128GB of GDDR7 memory and up to 30 petaflops of NVFP4 compute performance, specifically engineered for massive long-context AI inference tasks involving million-token sequences[1][2][3]. Integrated within the Vera Rubin NVL144 CPX platform, it delivers 8 exaflops of AI compute, 100TB of fast memory, and 1.7 petabytes per second of memory bandwidth, achieving 3x faster attention and 7.5x higher overall performance compared to prior NVIDIA GB300 systems, enabling transformative advances in software development and generative video AI workflows[1][3].