Nvidia Built One of the Most Powerful AI Supercomputers in 3 Weeks

Autonomous vehicles: they’re not perfect and sometimes they kill people. But they also hold the promise of safer transportation—and far fewer jobs—in the relatively near future. To help these vehicles upgrade their intelligence from causing fatal accidents to preventing them, Nvidia created the DGX SuperPod, an AI-optimized supercomputer that can help design a better self-driving car.
Nvidia has made it very clear it wants to be among the leaders in artificial intelligence, and it built a supercomputer to demonstrate that. The company needed only three weeks to assemble it, connecting 96 Nvidia DGX-2H systems with Mellanox interconnect technology. You can actually buy the building blocks, too, if the novelty of your sixth yacht has worn off and you have $435,000 burning a hole in your pocket. That’s how much one DGX-2H costs at list price, and the DGX SuperPod uses 96 of them. If you want to make a self-driving car, it seems Nvidia thinks it’ll cost somewhere in the ballpark of $41,760,000 to get started with the best hardware. Clearly, these systems were designed for large corporations.

With 1,536 Nvidia V100 Tensor Core GPUs, the SuperPod packs a lot of power for a relatively small system (by supercomputer standards). Nevertheless, if you want people to invest in anything that expensive you might want to prove that it’s up to the toughest of tasks. That’s why Nvidia decided to make its SuperPod build assist in solving one of the most difficult problems in AI.
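The hardware figures quoted so far are easy to sanity-check. The short Python sketch below uses only the numbers in the article (list price per DGX-2H, node count, total GPU count). The per-GPU figure is a derived illustration, not a price Nvidia quotes, and the total covers the nodes only, since the article gives no cost for the interconnect or anything else.

```python
# Figures quoted in the article.
DGX2H_LIST_PRICE_USD = 435_000  # list price of one DGX-2H
NODES = 96                      # DGX-2H systems in the SuperPod
TOTAL_GPUS = 1_536              # V100 Tensor Core GPUs in the SuperPod

# Total list price of the nodes alone (no interconnect, storage, or power).
total_cost_usd = DGX2H_LIST_PRICE_USD * NODES

# GPUs per node, derived from the two totals above.
gpus_per_node = TOTAL_GPUS // NODES

# Derived illustration only: node list price spread evenly over its GPUs.
cost_per_gpu_usd = total_cost_usd / TOTAL_GPUS

print(f"Total list price of nodes: ${total_cost_usd:,}")
print(f"GPUs per DGX-2H: {gpus_per_node}")
print(f"Node list price per GPU: ${cost_per_gpu_usd:,.2f}")
```

The total matches the $41,760,000 quoted above, and the GPU count works out to 16 per DGX-2H.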
Autonomous vehicles require an enormous amount of training data compared with technologies that use similar image classification models for other purposes (e.g. diagnostic medicine). The AI in a self-driving car isn’t looking for one specific thing; it needs to consider all of its surroundings and understand them well enough to function safely. That amounts to approximately one terabyte of data per vehicle per hour, and the AI that powers autonomous vehicles needs to be retrained continuously using data from an entire fleet. Nvidia decided to demonstrate how its SuperPod can help expedite the processing of training data measured in petabytes:
“The system is hard at work around the clock, optimizing autonomous driving software and retraining neural networks at a much faster turnaround time than previously possible. For example, the DGX SuperPod hardware and software platform takes less than two minutes to train ResNet-50. When this AI model came out in 2015, it took 25 days to train on the then state-of-the-art system, a single Nvidia K80 GPU. DGX SuperPOD delivers results that are 18,000x faster. While other TOP500 systems with similar performance levels are built from thousands of servers, DGX SuperPOD takes a fraction of the space, roughly 400x smaller than its ranked neighbors.”
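The 18,000x figure follows directly from the two training times Nvidia quotes. A minimal check in Python, treating "less than two minutes" as exactly two minutes (an assumption, which makes the result a lower bound on the speedup):

```python
MINUTES_PER_DAY = 24 * 60

# Training time for ResNet-50 on a single K80 in 2015, per Nvidia's statement.
k80_minutes = 25 * MINUTES_PER_DAY

# "Less than two minutes" on the SuperPod; two minutes is used as an upper bound.
superpod_minutes = 2

# Ratio of the two training times.
speedup = k80_minutes / superpod_minutes

print(f"K80 training time: {k80_minutes:,} minutes")
print(f"Speedup: at least {speedup:,.0f}x")
```

Twenty-five days is 36,000 minutes, so a two-minute run is 18,000 times faster, which is consistent with Nvidia's claim.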
While the DGX-2H happens to perform best using ResNet-50, those numbers would remain impressive when scaled for just about any image classification model. You should expect impressive performance from a multi-million dollar system, but given that it delivers that performance at such a (relatively) small size, it’s clear why Nvidia has remained dominant in the AI hardware market.
What kind of advancements might such capable hardware lead to? Nvidia demonstrated that as well with a new and more accurate method of calculating the distance to objects in three-dimensional space, so autonomous vehicles can more easily prevent collisions.
Nvidia explains some of the key areas where this new approach helps to improve safety:
“[W]e use convolutional neural networks and data from a single front camera. The DNN is trained to predict the distance to objects by using radar and lidar sensor data as ground-truth information. Engineers know this information is accurate because direct reflections of transmitted radar and lidar signals provide precise distance-to-object information, regardless of a road’s topology. By training the neural networks on radar and lidar data instead of relying on the flat ground assumption, we enable the DNN to estimate distance to objects from a single camera, even when the vehicle is going up or down hill.”
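To see why the flat ground assumption mentioned in that quote breaks down on hills, here is a simplified sketch of the classic single-camera geometry. This is not Nvidia's method; it is the baseline the quote contrasts with. It assumes an idealized pinhole camera with a level optical axis, and the camera height (1.5 m), focal length (1,000 pixels), and road geometry are hypothetical values chosen for illustration.

```python
def project_row_offset(distance_m, point_height_m, camera_height_m, focal_px):
    """Pixel rows below the image center at which a road point appears.

    point_height_m is the height of the road point above the plane the
    car is standing on (0 for a flat road, positive for a hill ahead).
    """
    return focal_px * (camera_height_m - point_height_m) / distance_m


def flat_ground_distance(row_offset_px, camera_height_m, focal_px):
    """Distance estimate that assumes the road ahead is perfectly flat."""
    if row_offset_px <= 0:
        raise ValueError("point is at or above the horizon; estimate undefined")
    return focal_px * camera_height_m / row_offset_px


# Hypothetical camera parameters, for illustration only.
CAMERA_HEIGHT_M = 1.5
FOCAL_PX = 1000.0
TRUE_DISTANCE_M = 30.0

# Case 1: flat road. The assumption holds and the estimate is correct.
flat_offset = project_row_offset(TRUE_DISTANCE_M, 0.0, CAMERA_HEIGHT_M, FOCAL_PX)
flat_estimate = flat_ground_distance(flat_offset, CAMERA_HEIGHT_M, FOCAL_PX)

# Case 2: the road rises 0.5 m over the same 30 m. The point appears
# closer to the horizon, so the flat-ground formula overestimates distance.
hill_offset = project_row_offset(TRUE_DISTANCE_M, 0.5, CAMERA_HEIGHT_M, FOCAL_PX)
hill_estimate = flat_ground_distance(hill_offset, CAMERA_HEIGHT_M, FOCAL_PX)

print(f"Flat road estimate: {flat_estimate:.1f} m (true: {TRUE_DISTANCE_M} m)")
print(f"Uphill estimate:    {hill_estimate:.1f} m (true: {TRUE_DISTANCE_M} m)")
```

With these made-up numbers, a rise of just half a meter turns a 30 m gap into an estimate of roughly 45 m. That is the kind of error a network trained on radar and lidar ground truth is meant to avoid.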
It seems like Nvidia has some new advancement in artificial intelligence on almost a weekly basis, even if it’s just a more efficient method of turning your dog into a lion. Perhaps that’s because the company can build one of the most efficiently powerful supercomputers in the world in three weeks. When you don’t have to wait very long to process an enormous amount of data, you can quickly test new ideas and find the optimal solution a lot faster. You just need around $40 million to get started.
Top image credit: Nvidia