With 100,000 NVIDIA GPUs, DDN’s high-efficiency data platform enables Grok to push the limits of natural language processing and AI inference at an unprecedented scale.
CHATSWORTH, Calif., Nov. 18, 2024 /PRNewswire/ — DDN®, a leading force in AI data intelligence, proudly announces a collaboration with NVIDIA to drive xAI’s Project Colossus in Memphis, Tennessee. This collaboration is a cornerstone in xAI’s bold vision to expand AI’s potential, driving Grok. Initially fueled by a combination of 100,000 NVIDIA Hopper GPUs and the NVIDIA Spectrum-X Ethernet networking platform, the solution maintains a 95% data throughput efficiency level during massive AI training. Colossus will soon scale to 200,000 GPUs, cementing its place as one of the world’s most powerful AI supercomputers and advancing the limits of what AI can achieve.
The Memphis facility, now a true data metropolis stretching across multiple data halls, has been designed to satisfy Grok’s requirement for speed, scale, and raw computational power. Think of this infrastructure as converting a high-rise into a bustling hub, fully optimized to support one of the world’s most powerful AI engines. At its core, DDN’s advanced AI data platform, turbocharged by the NVIDIA accelerated computing platform, combines the power of DDN’s EXAScaler and Infinia solutions. This setup delivers the scale and precision that cutting-edge AI demands—an engine fine-tuned for extreme efficiency and designed to handle intensive generative AI workloads.
DDN’s platform, designed for organizations to scale model training and inference, allows data to flow smoothly and efficiently, thanks to its streamlined DataPath technology. This setup maximizes data movement without the usual strain on hardware, power, cooling, or network resources, enabling xAI to expand Colossus’ training capabilities while keeping costs down and minimizing environmental impact. The result is a supercomputer that is as efficient as it is powerful.
Leaders on the Cutting Edge:
“By powering DDN’s platform with NVIDIA’s accelerated computing platform, we are equipping xAI with the technology needed to advance its most ambitious AI projects,” said Alex Bouzari, CEO and co-founder of DDN. “Our solutions are specifically engineered to drive efficiency at massive scale, and this deployment at xAI perfectly demonstrates the capabilities of our high-performance, AI-optimized technology.”
Elon Musk, CEO of xAI, said on X: “Colossus is the most powerful AI training system in the world. Moreover, it will double in size to 200k (50k H200s) in a few months. Excellent work by the team, NVIDIA and our many partners/suppliers.”
“Powerful AI systems require cutting-edge performance and scalability to meet the increasing demands of frontier AI models,” said Dion Harris, director of accelerated data center product solutions at NVIDIA. “Complementing the power of 100,000 NVIDIA Hopper GPUs connected via the NVIDIA Spectrum-X Ethernet platform, DDN’s cutting-edge data solutions provide xAI with the tools and infrastructure needed to drive AI development at exceptional scale and efficiency, helping push the limits of what’s possible in AI.”
Unprecedented Training Power and Efficiency
Project Colossus, supercharged by DDN, sets a new benchmark in AI model training power and speed. Grok taps into the massive compute power of 100,000 GPUs, all seamlessly supported by DDN’s EXAScaler and Infinia solutions. DDN’s data platform drastically reduces training time, enabling rapid model iteration and greater flexibility for updates. With Colossus and DDN’s architecture, xAI can tackle larger datasets and increasingly complex model architectures, driving breakthrough performance in applications like natural language processing and conversational AI—all at a scale previously thought unachievable.
Powering Real-World AI Inference at Scale
Beyond training, DDN’s high-efficiency platform amplifies AI inference capabilities in Colossus, allowing xAI to deploy powerful models at scale. DDN’s streamlined data pathways boost inference speeds for real-time applications, ensuring Grok’s impact is felt directly by users across platforms like X. The enhanced performance Colossus achieves by leveraging DDN solutions primes Grok to become one of the most advanced AI systems available commercially, bringing AI-driven user experiences to new heights and setting the standard for speed and scalability in real-world applications.
DDN Enables AI Success at Three Critical Levels:
- Data Center & Cloud Optimization: DDN solutions deliver end-to-end optimization across compute, network, and storage for GPU workloads, reducing overhead and inefficiencies by 75% compared to other solutions. For large language model (LLM) workloads, DDN achieves a 10x cost benefit by optimizing data loading, checkpointing, and inference in generative AI (GenAI). This means faster AI results, with lower costs, in a smaller footprint.
- AI Framework/LLM/GenAI Acceleration: DDN accelerates the analytics layer in AI workflows, boosting LLM performance by up to 10x, even in constrained environments. This reduces GPU waste, speeds up training, and shortens time to market for AI products, providing a strong business advantage.
- Data Orchestration and Movement Optimization: The DDN platform ensures efficient data flow across edge, data center, and multi-cloud environments. By minimizing latency and reducing unnecessary data transfer, the platform cuts costs and enhances scalability, creating a flexible, future-proof infrastructure for AI-driven innovation.
A Legacy of Collaboration with NVIDIA
For over seven years, DDN has been working with NVIDIA on supercomputing innovations, starting with the renowned Selene supercomputer. This collaboration grew to include support for the Eos supercomputer and now extends to the latest NVIDIA Blackwell platform.
About DDN
DDN is the world’s leading data intelligence company that provides an advantage to over 11,000 customers focused on unlocking real-time AI & HPC insights. The DDN Data Intelligence Platform supercharges more than 500,000 GPUs worldwide across a broad range of use cases, including autonomous driving, financial services, healthcare, research and academia. Manage complex data, enhance performance, deliver cost savings, increase security and accelerate your AI & HPC workloads at scale from edge to core to cloud.
Contact:
Press Relations at DDN
SOURCE DataDirect Networks (DDN)