Nvidia Unveils ‘Grace’ Deep-Learning CPU for Supercomputing Applications


Nvidia thinks it’s time for traditional CPUs to step aside when it comes to tackling the largest machine learning tasks, especially training huge models that now run upwards of a trillion parameters. Conventional supercomputers rely on specialized processors, typically GPUs, to do much of the compute-intensive math during training, but GPUs can’t host nearly the amount of memory those models need, let alone share it quickly in multi-GPU configurations. As a result, the machine is often bottlenecked by the speed of moving data from CPU memory to the GPU and back.
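To make that bottleneck concrete, here is a minimal, illustrative CUDA sketch (not an Nvidia benchmark) that times a plain host-to-device copy and reports the effective bandwidth; the 1GiB buffer size and overall setup are arbitrary assumptions for illustration.

// Illustrative only: measure effective host-to-device copy bandwidth.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 1ULL << 30;              // 1 GiB test buffer (arbitrary)
    float *host = nullptr, *device = nullptr;
    cudaMallocHost((void**)&host, bytes);          // pinned host memory for a fair transfer test
    cudaMalloc((void**)&device, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    cudaMemcpy(device, host, bytes, cudaMemcpyHostToDevice);   // CPU memory -> GPU memory
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    // Effective bandwidth = bytes moved / seconds elapsed
    printf("Host-to-device bandwidth: %.1f GB/s\n", (bytes / 1e9) / (ms / 1e3));

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(device);
    cudaFreeHost(host);
    return 0;
}

On a typical PCIe-attached GPU the result lands in the tens of gigabytes per second, far below what the GPU’s own memory can deliver, and that gap is exactly what a faster CPU-GPU link is meant to close.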

With Nvidia’s new Grace deep-learning CPU, which the company unveiled today, supercomputer GPUs get both much faster access to CPU memory and much better aggregate bandwidth when multiple GPUs are paired with a single CPU. Nvidia says the device required 10,000 engineering-years of work, but given the company’s gift for hyperbole, we’re not sure what it’s counting. And as you may have already guessed, the ARM-based CPU is named after Grace Hopper, an early computing pioneer.

Key to Grace’s performance gain is Nvidia’s NVLink interconnect between the CPU and multiple GPUs. Nvidia says it can move 900GB/s over NVLink, many times the bandwidth typically available between CPU and GPU. The CPU memory itself also gets an upgrade, as Grace will pair with LPDDR5x RAM. Grace will also offer a unified memory space and cache coherence between the CPU and GPUs, which should make GPGPU programming less of a headache. It will support Nvidia’s CUDA and CUDA-X libraries, along with its HPC software development kit.
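For a sense of what that unified memory model buys developers, here’s a minimal sketch using CUDA’s existing managed-memory API (cudaMallocManaged), in which the CPU and GPU touch the same allocation with no explicit copies. The kernel name, sizes, and launch configuration are illustrative assumptions; the code is generic CUDA, not anything Grace-specific.

// Illustrative only: one allocation shared by CPU and GPU, no cudaMemcpy.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;
    float *data = nullptr;
    cudaMallocManaged((void**)&data, n * sizeof(float));  // visible to both CPU and GPU

    for (int i = 0; i < n; ++i) data[i] = 1.0f;           // CPU writes the buffer
    scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f);       // GPU works on the same pointer
    cudaDeviceSynchronize();

    printf("data[0] = %.1f\n", data[0]);                  // CPU reads the GPU's result
    cudaFree(data);
    return 0;
}

Today this style of code leans on page migration behind the scenes; the pitch for Grace is that hardware coherence and NVLink bandwidth let the same programming model run without the usual transfer penalty.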

Overall, Nvidia says a Grace-powered system will be able to train large models (think a trillion parameters and up) and run comparably sized simulations as much as 10 times faster than an equivalent system built around x86 CPUs. Nvidia was careful to stress, though, that it doesn’t see Grace displacing x86 processors in smaller-scale applications.


As exciting as Grace’s potential is, it will be a while before anyone gets to work with one. Nvidia expects the chip to be available in 2023. The company also announced that the Swiss National Supercomputing Centre (CSCS) and Los Alamos National Laboratory are planning to build massive new supercomputers around Grace CPUs. The machines will be built by Hewlett Packard Enterprise and are expected to come online that same year. Both customers expect the new machines to let them analyze larger datasets than before and improve the performance of their scientific computing software.
