Elon Musk plans to expand the Colossus AI supercomputer tenfold


Unlock Editor’s Digest for free

Elon Musk’s artificial intelligence startup xAI has pledged to expand its Colossus supercomputer tenfold to include more than 1 million GPUs in a bid to outpace rivals such as Google, OpenAI and Anthropic.

Built in just three months earlier this year, Colossus is said to be the world’s largest supercomputer, running a cluster of more than 100,000 interconnected Nvidia GPUs. Chips are used for training in Musk the Grok chatbot, which is less advanced and has fewer users than market leader ChatGPT or Google’s Gemini.

Work has already begun to expand the size of the Memphis, Tennessee, facility, the Greater Memphis Chamber said in a statement Wednesday. NvidiaThe chamber said Dell and Supermicro Computer will also open operations in Memphis to support the expansion, as well as create an “xAI special operations group” to “provide 24/7 concierge services to the company.”

The cost of purchasing that many GPUs would be significant. The latest generation of Nvidia GPUs typically cost tens of thousands of dollars, although older versions of the chips can be cheaper. Musk’s planned Colossus expansion will require an investment that could reach tens of billions of dollars — plus the high cost of building, powering and cooling the massive servers that will host them. xAI has raised about $11 billion in capital from investors this year.

AI companies are looking to provide GPUs and access to data centers to provide the computing power needed to train and run their advanced large-scale language models.

OpenAI, maker of ChatGPT, has a nearly $14 billion partnership with Microsoft that includes computing power credits. Anthropic, maker of the Claude chatbot, has received $8 billion in investment from Amazon and will soon access granted into a new cluster of more than 100,000 specialized AI chips.

Instead of building partnerships, Musk, the world’s richest man, has used his power and influence in the tech sector to build his own supercomputing powerhouse, though he’s catching up after founding xAI just over a year ago. The trajectory has been steep, with the startup valued at $45 billion and recently raising another $5 billion.

Musk is in fierce competition with OpenAI, which he helped co-found with Sam Altman in 2015. They subsequently fell out, and now Musk is suing OpenAI, trying to block its transition from a non-profit to a more traditional organization.

An xAI investor said the speed with which Musk built Colossus was a “feather in the cap” of the AI ​​company, despite having limited commercial product offerings. “He built the world’s most powerful supercomputer in three months.”

Jensen Huang, Nvidia’s chief executive, said in October that “there was only one person in the world who could do this.” Huang called Colossus “the fastest single-cluster supercomputer on the planet” and said it would normally take three years to build a data center of this size.

The Colossus project was controversial because of the speed with which it was built. Some have accused him of circumventing planning permissions and criticized the demands he places on the region’s electricity grid.

“We are not just leading from the front; we are accelerating progress at an unprecedented rate while ensuring network stability using megapack technology,” Brent Mayo, xAI’s senior manager of site assembly and infrastructure, said at the event in Memphis, according to a statement.



Source link

Leave a Reply

Your email address will not be published. Required fields are marked *