Nvidia’s Jetson Xavier Stuffs Volta Performance Into Tiny Form Factor

This week, Nvidia unveiled its Jetson Xavier platform, a new compute board with significantly higher performance than previous models from Team Green. Up until now, the company has offered the Jetson TK1 (2014), TX1 (2015), and TX2 (2017) as edge compute devices for AI workloads. The TK1 was built around Kepler, the TX1 used Maxwell, the TX2 is based on Pascal, and Xavier is, as one might expect, based on Volta.

The new board packs 512 GPU cores (the TX1 and TX2 were both 256-core solutions) alongside an eight-core ARM CPU of unspecified vintage. Nvidia has not clarified whether this is a further evolution and refinement of its Denver CPU core, or whether the company is using a bog-standard ARM Cortex design. Nvidia mentions ARMv8.2, which is interesting, because ARM’s own blog states that v8.2 added support for an “enhanced memory model, half-precision floating point data processing and introduces both RAS (reliability availability serviceability) support and statistical profiling extension (SPE).” Scalable Vector Extensions (SVE) are also supported in that revision of the instruction set.
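
As a rough illustration of what that half-precision support means in practice, here is a minimal C sketch using the __fp16 type that ARM toolchains expose. This is not Nvidia code; it simply shows the kind of FP16 data handling an ARMv8.2-A core can run natively when built with the FP16 extension (for example, -march=armv8.2-a+fp16). The values and array sizes are arbitrary.

    #include <stdio.h>

    /* Minimal half-precision example. __fp16 is ARM's storage type for FP16;
     * built for ARMv8.2-A with the FP16 extension, the conversions map to
     * native instructions, and _Float16 lets arithmetic itself stay in half
     * precision. */
    static float dot_fp16(const __fp16 *a, const __fp16 *b, int n)
    {
        float acc = 0.0f;  /* accumulate in FP32 to limit rounding error */
        for (int i = 0; i < n; i++)
            acc += (float)a[i] * (float)b[i];
        return acc;
    }

    int main(void)
    {
        __fp16 a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
        __fp16 b[4] = {0.5f, 0.25f, 2.0f, 1.0f};
        printf("dot = %f\n", dot_fp16(a, b, 4));
        return 0;
    }

Storing weights and activations in FP16 while accumulating in FP32 is a common compromise between memory bandwidth and numerical accuracy, and it is exactly the kind of workload half-precision hardware support is meant to speed up.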

Other upgrades from the TX2 to Jetson Xavier include double the RAM (8GB to 16GB) at more than double the bandwidth (59.7GB/s to 137GB/s), plus a pair of new Nvidia-specific deep learning accelerators. The NVDLA is described on Nvidia’s NVDLA.org site as an inference-processing solution for various types of machine learning workloads. The exact text states:

NVDLA hardware is comprised of the following components:

Convolution Core – optimized high-performance convolution engine.
Single Data Processor – single-point lookup engine for activation functions.
Planar Data Processor – planar averaging engine for pooling.
Channel Data Processor – multi-channel averaging engine for advanced normalization functions.
Dedicated Memory and Data Reshape Engines – memory-to-memory transformation acceleration for tensor reshape and copy operations.

The same documentation notes that the configurations are modular and intended to be adjusted depending on the needs of the customer, so it’s not clear exactly which solution Nvidia is shipping with Xavier (the company’s documentation goes through two examples, a “small” and a “large” NVDLA configuration).
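
To make that division of labor a little more concrete, the toy C sketch below strings together software stand-ins for three of those blocks: a small convolution, a pointwise activation (the “single data processor”), and a planar average pool. It is purely an illustration of the data flow through such a pipeline, not NVDLA’s actual programming interface; all names and sizes are invented for the example.

    #include <stdio.h>

    #define N 8

    /* Toy software stand-ins for three NVDLA-style pipeline stages.
     * Purely illustrative; this is not NVDLA's programming interface. */

    /* "Convolution core": 1-D convolution with a 3-tap kernel, zero-padded. */
    static void conv1d(const float *in, const float *kernel, float *out, int n)
    {
        for (int i = 0; i < n; i++) {
            float acc = 0.0f;
            for (int j = -1; j <= 1; j++) {
                int idx = i + j;
                if (idx >= 0 && idx < n)
                    acc += in[idx] * kernel[j + 1];
            }
            out[i] = acc;
        }
    }

    /* "Single data processor": pointwise activation function (ReLU here). */
    static void relu(float *x, int n)
    {
        for (int i = 0; i < n; i++)
            if (x[i] < 0.0f)
                x[i] = 0.0f;
    }

    /* "Planar data processor": average pooling over a window of two. */
    static void avg_pool2(const float *in, float *out, int n)
    {
        for (int i = 0; i < n / 2; i++)
            out[i] = 0.5f * (in[2 * i] + in[2 * i + 1]);
    }

    int main(void)
    {
        float input[N]  = {1, -2, 3, -4, 5, -6, 7, -8};
        float kernel[3] = {0.25f, 0.5f, 0.25f};
        float conv[N], pooled[N / 2];

        conv1d(input, kernel, conv, N);   /* convolution core  */
        relu(conv, N);                    /* activation lookup */
        avg_pool2(conv, pooled, N);       /* planar pooling    */

        for (int i = 0; i < N / 2; i++)
            printf("%.3f ", pooled[i]);
        printf("\n");
        return 0;
    }

The point of the dedicated hardware is that each of these stages runs as a fixed-function block rather than as general-purpose loops like the ones above, which is what makes NVDLA attractive for inference at low power.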

Nvidia says its Xavier board can stretch to fit a variety of usage models at TDPs ranging from 10W to 30W, and claims the platform can hit 10 TFLOPS of FP16 and 20 TOPS of INT8. FP32 performance is 5 TFLOPS. The board is a significant step forward for Nvidia’s overall AI and ML performance in this form factor and comes on the heels of announcements like the HGX-2 — a much larger, ‘big iron’ server configuration intended for labs with far more cash to drop and more power to burn. The HGX-2 can draw 10 kilowatts, which, as The Next Platform notes, is a bit of a game-changer for this kind of workload and capability. At 30W, the Jetson Xavier is intended for much more modest uses and platforms, where it still brings far more performance to the table than its predecessor.
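
As a back-of-the-envelope check on those figures (and assuming the peak numbers correspond to the 30W ceiling, which Nvidia has not broken out per TDP mode), the precision scaling and efficiency work out roughly as follows:

    #include <stdio.h>

    int main(void)
    {
        /* Peak figures quoted for Jetson Xavier; pairing them with the 30W
         * ceiling is an assumption, not a published specification. */
        const double fp32_tflops = 5.0;
        const double fp16_tflops = 10.0;
        const double int8_tops   = 20.0;
        const double tdp_watts   = 30.0;

        /* Each step down in precision doubles the peak throughput. */
        printf("FP16 vs FP32: %.1fx\n", fp16_tflops / fp32_tflops);
        printf("INT8 vs FP16: %.1fx\n", int8_tops / fp16_tflops);

        /* Rough efficiency if the INT8 peak is hit at the 30W ceiling. */
        printf("INT8 efficiency: ~%.2f TOPS/W\n", int8_tops / tdp_watts);
        return 0;
    }

In other words, each step down in precision doubles peak throughput, and the INT8 peak pencils out to roughly 0.67 TOPS per watt at the top power mode.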
