Nvidia’s Jetson Xavier Stuffs Volta Performance Into Tiny Form Factor

This week, Nvidia unveiled its new Jetson Xavier platform, a compute board with significantly higher performance than the previous models from Team Green. Up until now, the company has offered the Jetson TK1 (2014), Jetson TX1 (2015), and Jetson TX2 (2017) as edge compute devices for AI workloads. The TK1 was built around Kepler, the TX1 used Maxwell, the TX2 is based on Pascal, and Xavier is, as one might expect, based on Volta.

The new board packs 512 GPU cores (the TX1 and TX2 were both 256-core solutions) alongside an eight-core ARM CPU of unspecified vintage. Nvidia has not clarified whether this is a further evolution and refinement of its Denver CPU core or whether the company is using a bog-standard ARM Cortex design. Nvidia mentions ARMv8.2, which is interesting, because ARM’s own blog states that 8.2 included support for an “enhanced memory model, half-precision floating point data processing and introduces both RAS (reliability availability serviceability) support and statistical profiling extension (SPE).” Scalable Vector Extensions (SVE) are also supported in that amended instruction set.
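Half-precision support matters here because it halves the memory footprint and bandwidth cost of every value, a common trade-off for inference workloads. Below is a minimal, hardware-agnostic sketch using NumPy's float16 rather than any ARM-specific intrinsics, so it demonstrates the format, not the silicon:

```python
import numpy as np

# Illustrative only: NumPy's float16 mimics the IEEE 754 half-precision
# format that ARMv8.2's FP16 data-processing instructions handle natively.
weights_fp32 = np.random.rand(1024, 1024).astype(np.float32)
weights_fp16 = weights_fp32.astype(np.float16)

print(weights_fp32.nbytes)  # 4194304 bytes
print(weights_fp16.nbytes)  # 2097152 bytes -- half the memory footprint

# Half-precision arithmetic trades accuracy for bandwidth and throughput,
# which is usually an acceptable bargain for neural-network inference.
x = np.float16(0.1) + np.float16(0.2)
print(x)  # ~0.2998 rather than 0.3, showing the reduced precision
```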

Other upgrades from the TX2 to Jetson Xavier include double the RAM (8GB to 16GB) at more than double the bandwidth (59.7GB/s versus 137GB/s), plus a pair of new Nvidia-specific deep learning accelerators. The NVDLA is described on Nvidia’s NVDLA.org site as an inference-processing solution for various types of machine learning workloads. The exact text states:

NVDLA hardware is comprised of the following components:

Convolution Core – optimized high-performance convolution engine.
Single Data Processor – single-point lookup engine for activation functions.
Planar Data Processor – planar averaging engine for pooling.
Channel Data Processor – multi-channel averaging engine for advanced normalization functions.
Dedicated Memory and Data Reshape Engines – memory-to-memory transformation acceleration for tensor reshape and copy operations.

The same report notes that the configurations are modular and intended to be adjusted to the needs of the customer, so it’s not clear exactly which configuration Nvidia is shipping with Xavier (the company’s documentation walks through two examples, a “small” and a “large” NVDLA model).
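For readers who want a mental model of what those blocks do, the following hypothetical NumPy sketch mirrors the listed stages in software (convolution, activation, pooling, normalization). It is not NVDLA code or any Nvidia API, just an illustration of the dataflow those fixed-function engines accelerate in hardware:

```python
import numpy as np

def conv2d(x, k):
    """Naive 2D convolution -- the role of the Convolution Core."""
    h, w = x.shape
    kh, kw = k.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def relu(x):
    """Activation function -- what the Single Data Processor handles via lookup."""
    return np.maximum(x, 0)

def avg_pool(x, size=2):
    """Planar averaging for pooling -- the Planar Data Processor's job."""
    h, w = (d - d % size for d in x.shape)
    return x[:h, :w].reshape(h // size, size, w // size, size).mean(axis=(1, 3))

def normalize(x):
    """Normalization -- roughly the Channel Data Processor's role."""
    return (x - x.mean()) / (x.std() + 1e-6)

# Toy pipeline over a random "feature map" and kernel.
fmap = np.random.rand(8, 8)
kernel = np.random.rand(3, 3)
print(normalize(avg_pool(relu(conv2d(fmap, kernel)))).shape)  # (3, 3)
```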

Nvidia specifies that the Xavier board can stretch to fit a variety of usage models at TDPs ranging from 10W to 30W, with claims that the platform can hit 10 TFLOPS of FP16 and 20 TOPS of INT8; FP32 performance is 5 TFLOPS. The board is a significant step forward for Nvidia’s overall AI and ML performance in this form factor, and it comes on the heels of announcements like the HGX-2, a much larger, ‘big iron’ server configuration intended for labs with far more cash to drop and more power to burn. The HGX-2 can draw 10 kilowatts, which, as The Next Platform notes, is a bit of a game changer for this kind of workload and capability. At 30W, the Jetson Xavier is intended for much more modest uses and platforms, where it still brings far more performance to the table than its predecessor.
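Taking Nvidia’s headline figures at face value, a quick back-of-the-envelope calculation puts the efficiency claim in context. Note the assumption that peak throughput coincides with the 30W ceiling, which the announcement does not spell out:

```python
# Back-of-the-envelope efficiency math from the quoted peak figures.
# Assumes peak throughput is reached at the 30W TDP ceiling, which the
# announcement does not explicitly state.
fp32_tflops = 5.0
fp16_tflops = 10.0
int8_tops = 20.0
tdp_watts = 30.0

print(f"FP32: {fp32_tflops / tdp_watts * 1000:.0f} GFLOPS/W")  # ~167 GFLOPS/W
print(f"FP16: {fp16_tflops / tdp_watts * 1000:.0f} GFLOPS/W")  # ~333 GFLOPS/W
print(f"INT8: {int8_tops / tdp_watts * 1000:.0f} GOPS/W")      # ~667 GOPS/W
```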
