NVIDIA's next generation of AI supercomputer chips is here
They're more capable, of course, but they're also geared more toward AI and LLMs.
NVIDIA has launched its next generation of AI supercomputer chips, which the company says will likely play a large role in future breakthroughs in deep learning and large language models (LLMs) like OpenAI's GPT-4. The technology represents a significant leap over the last generation and is poised to be used in data centers and supercomputers, working on tasks like weather and climate prediction, drug discovery, quantum computing and more.
The key product is the HGX H200 GPU based on NVIDIA's "Hopper" architecture, a replacement for the popular H100 GPU. It's the company's first chip to use HBM3e memory, which is faster and has more capacity, making it better suited for large language models. "With HBM3e, the NVIDIA H200 delivers 141GB of memory at 4.8 terabytes per second, nearly double the capacity and 2.4x more bandwidth compared with its predecessor, the NVIDIA A100," the company wrote.
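As a rough check on the ratios in that quote, here is a minimal Python sketch. The H200 figures come from NVIDIA's statement; the A100 baseline (80GB of memory at roughly 2.0 terabytes per second) is an assumption taken from the 80GB A100's commonly published specs, not from the announcement itself.

```python
# H200 figures as quoted by NVIDIA in the announcement.
H200_MEMORY_GB = 141
H200_BANDWIDTH_TBPS = 4.8

# Assumed baseline: the 80GB A100 at roughly 2.0 TB/s.
# These two numbers are NOT in the article; they are an assumption.
A100_MEMORY_GB = 80
A100_BANDWIDTH_TBPS = 2.0


def ratio(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


capacity_ratio = ratio(H200_MEMORY_GB, A100_MEMORY_GB)
bandwidth_ratio = ratio(H200_BANDWIDTH_TBPS, A100_BANDWIDTH_TBPS)

print(f"capacity:  {capacity_ratio:.2f}x")   # capacity:  1.76x
print(f"bandwidth: {bandwidth_ratio:.2f}x")  # bandwidth: 2.40x
```

Under those assumed baseline numbers, the capacity works out to about 1.76x ("nearly double") and the bandwidth to 2.4x, which lines up with the company's wording.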
In terms of benefits for AI, NVIDIA says the HGX H200 doubles inference speed on Llama 2, a 70-billion-parameter LLM, compared to the H100. It'll be available in 4- and 8-way configurations that are compatible with both the software and hardware in H100 systems. It'll work in every type of data center (on-premises, cloud, hybrid-cloud and edge), and be deployed by Amazon Web Services, Google Cloud, Microsoft Azure and Oracle Cloud Infrastructure, among others. It's set to arrive in Q2 2024.
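To give a sense of why memory capacity matters for a model of that size, here is a back-of-the-envelope Python sketch of the weight footprint of a 70-billion-parameter model. It assumes 16-bit weights (2 bytes per parameter) and an 80GB H100 for comparison; neither figure comes from NVIDIA's announcement, and the estimate ignores the extra memory that real inference needs for things like the KV cache and activations. NVIDIA does not attribute its claimed speedup to this calculation; it is only an illustration.

```python
PARAMS = 70e9            # Llama 2 70B parameter count, as cited in the article
BYTES_PER_PARAM = 2      # assumption: 16-bit (FP16/BF16) weights

H200_MEMORY_GB = 141     # from NVIDIA's announcement
H100_MEMORY_GB = 80      # assumption: 80GB H100, not stated in the article

# Weights only, in decimal gigabytes (1 GB = 1e9 bytes).
weights_gb = PARAMS * BYTES_PER_PARAM / 1e9


def weights_fit(weights_gb: float, gpu_memory_gb: float) -> bool:
    """True if the raw weights alone fit in one GPU's memory."""
    return weights_gb <= gpu_memory_gb


print(f"weights: {weights_gb:.0f} GB")                             # weights: 140 GB
print("fits on H200:", weights_fit(weights_gb, H200_MEMORY_GB))    # fits on H200: True
print("fits on H100:", weights_fit(weights_gb, H100_MEMORY_GB))    # fits on H100: False
```

In practice a deployment would need headroom beyond the raw weights, so this only shows the order of magnitude involved, not whether a given setup would actually run on one GPU.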
NVIDIA's other key product is the GH200 Grace Hopper "superchip" that marries the HGX H200 GPU with the Arm-based NVIDIA Grace CPU using the company's NVLink-C2C interconnect. It's designed for supercomputers to allow "scientists and researchers to tackle the world’s most challenging problems by accelerating complex AI and HPC applications running terabytes of data," NVIDIA wrote.
The GH200 will be used in "40+ AI supercomputers across global research centers, system makers and cloud providers," the company said, including systems from Dell, Eviden, Hewlett Packard Enterprise (HPE), Lenovo, QCT and Supermicro. Notable among those are HPE's Cray EX2500 supercomputers, which will use quad GH200s and scale up to tens of thousands of Grace Hopper Superchip nodes.
Perhaps the biggest Grace Hopper supercomputer will be JUPITER, located at the Jülich facility in Germany, which will become the "world’s most powerful AI system" when it's installed in 2024. It uses a liquid-cooled architecture, "with a booster module comprising close to 24,000 NVIDIA GH200 Superchips interconnected with the NVIDIA Quantum-2 InfiniBand networking platform," according to NVIDIA.
NVIDIA says JUPITER will aid scientific breakthroughs in a number of areas, including climate and weather prediction, where it will generate high-resolution simulations with interactive visualization. It'll also be employed for drug discovery, quantum computing and industrial engineering. Many of these areas use custom NVIDIA software solutions that ease development but also make supercomputing groups reliant on NVIDIA hardware.
The new technologies will be key for NVIDIA, which now makes most of its revenue from the AI and data center segments. Last quarter the company saw a record $10.32 billion in revenue in that area alone (out of $13.51 billion total revenue), up 171 percent from a year ago. It no doubt hopes the new GPU and superchip will help continue that trend. Just last week, NVIDIA broke its own AI training benchmark record using older H100 technology, so its new tech should help it extend that lead over rivals in a sector it already dominates.
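For readers who want to see what those revenue figures imply, here is a short Python sketch. It uses only the numbers quoted above and assumes "up 171 percent" means the segment's revenue is 2.71 times the year-ago figure; the derived values are approximations, not numbers reported by NVIDIA.

```python
DATA_CENTER_REVENUE_B = 10.32   # billions of dollars, from the article
TOTAL_REVENUE_B = 13.51         # billions of dollars, from the article
YOY_GROWTH = 1.71               # "up 171 percent", read as a 171% increase

# Share of total revenue that came from the AI and data center segment.
segment_share = DATA_CENTER_REVENUE_B / TOTAL_REVENUE_B

# Implied year-ago segment revenue: current = year_ago * (1 + growth).
implied_year_ago_b = DATA_CENTER_REVENUE_B / (1 + YOY_GROWTH)

print(f"segment share of revenue: {segment_share:.1%}")           # 76.4%
print(f"implied year-ago revenue: ${implied_year_ago_b:.2f}B")    # $3.81B
```

On that reading, the segment accounted for roughly three quarters of total revenue and was under $4 billion a year earlier.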