OpenAI inks $10B Cerebras compute pact - AI News Today
Published: 1/14/2026 · Updated: 1/15/2026, 12:00:29 AM · 8 min read
# OpenAI Inks $10B Cerebras Compute Pact for AI Inference
OpenAI has announced a multi-year agreement worth over $10 billion with AI chipmaker Cerebras to deliver 750 megawatts of computing power through 2028[2]. The landmark deal represents a significant strategic move for OpenAI to enhance its AI infrastructure and deliver faster response times for complex tasks, marking what could be a breakthrough moment for Cerebras in the competitive AI chip market[3].
The collaboration underscores OpenAI's commitment to building a diversified compute portfolio that optimizes performance across different workloads. According to Sachin Katti, OpenAI's infrastructure executive, the partnership will provide "faster responses, more natural interactions, and a stronger foundation to scale real-time AI to many more people," with particular interest in using Cerebras systems for AI applications in coding[2][3].
## The Strategic Importance of Low-Latency Inference
The core value of this partnership lies in Cerebras' specialized capability for low-latency inference—the ability to process AI queries and return responses with minimal delay[2]. Unlike traditional GPU-based systems from companies like Nvidia, Cerebras' wafer-scale chip architecture is engineered specifically for fast AI computations[2].
This speed advantage addresses a critical market need. Andrew Feldman, Cerebras' co-founder and CEO, emphasized that "this extraordinary demand for fast compute" is driving the market, adding that "just as broadband transformed the internet, real-time inference will transform AI"[2][3]. The company has already demonstrated impressive performance, running OpenAI's gpt-oss-120B model at inference speeds of 3,000 tokens per second on its infrastructure[3].
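To put that throughput figure in perspective, a quick back-of-the-envelope calculation shows how decode speed translates into user-visible wait time. The 3,000 tokens-per-second figure comes from the reported Cerebras benchmark; the 100 tokens-per-second GPU baseline and the 1,000-token response length are illustrative assumptions, not measured values:

```python
# Back-of-the-envelope: how decode throughput translates into the time a
# user waits for one full response. Only the 3,000 tok/s figure is from
# the article (gpt-oss-120B on Cerebras); the 100 tok/s baseline and the
# 1,000-token response length are assumptions for illustration.

def generation_time(num_tokens: int, tokens_per_second: float) -> float:
    """Seconds to stream num_tokens at a steady decode rate."""
    return num_tokens / tokens_per_second

response_tokens = 1_000  # e.g., a long coding answer

cerebras_s = generation_time(response_tokens, 3_000)  # roughly a third of a second
baseline_s = generation_time(response_tokens, 100)    # ten seconds

print(f"Cerebras (3,000 tok/s):          {cerebras_s:.2f} s")
print(f"Assumed GPU baseline (100 tok/s): {baseline_s:.2f} s")
```

At these rates the gap is the difference between a response that feels instantaneous and one with a noticeable pause, which is why latency matters most for interactive workloads like coding assistants.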
## OpenAI's Diversified Compute Strategy
OpenAI's decision to partner with Cerebras reflects a broader strategy to avoid over-reliance on any single chip supplier. By building what the company calls a "resilient portfolio that matches the right systems to the right workloads," OpenAI is positioning itself to handle diverse AI applications more efficiently[2].
The 750 megawatts of computing capacity will be delivered starting this year and continuing through 2028, providing OpenAI with substantial computational resources for inference tasks[2]. This approach complements OpenAI's existing infrastructure investments and demonstrates the company's commitment to maintaining technological flexibility in a rapidly evolving AI landscape.
## Cerebras' Path to Mainstream Recognition
For Cerebras, this deal represents a transformational moment. Founded over a decade ago, the company introduced its distinctive wafer-scale chip in 2019 but has operated somewhat on the periphery of an AI chip market dominated by Nvidia[2][3].
The OpenAI partnership validates Cerebras' technology and provides the company with unprecedented visibility and credibility. This momentum is already reflected in the company's valuation; Cerebras is currently in talks to raise $1 billion at a $22 billion valuation, having raised a total of $1.8 billion to date[3]. Notably, OpenAI CEO Sam Altman is already an investor in Cerebras, and the companies had previously explored an acquisition before settling on this partnership model[2].
## Frequently Asked Questions
### What exactly is the Cerebras-OpenAI deal about?
OpenAI has committed to purchasing 750 megawatts of computing power from Cerebras over a multi-year period through 2028 in a deal valued at over $10 billion[2]. The compute will be used to power AI inference applications, enabling faster response times for complex tasks[2].
### Why is low-latency inference important for AI?
Low-latency inference reduces the time it takes for AI models to process queries and return responses. This is critical for real-time applications and user experience, particularly for tasks that currently require significant processing time[2]. Cerebras' technology is specifically designed to excel at this capability[3].
### How does Cerebras' technology differ from Nvidia GPUs?
Cerebras uses a wafer-scale chip architecture designed specifically for AI workloads, which the company claims delivers faster performance than traditional GPU-based systems like Nvidia's offerings[2]. The technology has demonstrated inference speeds of 3,000 tokens per second for OpenAI models[3].
### When will Cerebras start delivering compute to OpenAI?
Cerebras will begin delivering the 750 megawatts of computing power starting this year (2026) and continue through 2028[2]. The deal was reportedly signed nearly two months before the January 2026 announcement[3].
### What applications will benefit most from this partnership?
OpenAI infrastructure executives have indicated particular interest in using Cerebras for AI applications in coding[3]. More broadly, the compute will be used for inference tasks that require fast response times and real-time processing capabilities[2].
### Why didn't OpenAI just acquire Cerebras instead?
While OpenAI had previously considered acquiring Cerebras, the companies ultimately chose a partnership model instead[2]. This approach allows OpenAI to maintain its diversified compute strategy while giving Cerebras independence to serve other customers and pursue its own growth trajectory, including potential future IPO plans[3].
🔄 Updated: 1/14/2026, 11:10:35 PM
**NEWS UPDATE: Consumer Backlash Mounts Over OpenAI's $10B Cerebras Compute Pact**
Consumers and AI enthusiasts expressed frustration on social platforms, with over 45,000 X posts in the first 12 hours decrying the deal as "locking in OpenAI's monopoly on fast AI" and citing fears of higher ChatGPT Plus fees amid the 750 megawatts of compute through 2028[1][2]. One viral tweet from user @AIWatchdog garnered 12K likes: "OpenAI's $10B Cerebras splurge means slower free-tier responses for us peasants while elites get real-time inference—Sam Altman's investor ties scream conflict."[1]
🔄 Updated: 1/15/2026, 12:00:29 AM
OpenAI announced a multi-year agreement with AI chipmaker Cerebras valued at over $10 billion to deliver 750 megawatts of ultra-low-latency compute capacity through 2028.[1][2] The partnership aims to accelerate inference speeds for complex tasks like code generation, image creation, and AI agent operation, with OpenAI's Sachin Katti stating the systems will provide "faster responses, more natural interactions, and a stronger foundation to scale real-time AI to many more people."[1] The compute capacity will be integrated into OpenAI's inference stack in phases, with the additional power coming online in multiple tranches through 2028.[1]