Nvidia Announces New Tesla T4 GPUs For Data Center Inferencing

With Turing fast approaching for consumer cards, Nvidia is bringing new GPUs to the data center and HPC markets as well. Last week, the company announced its new Tesla T4 GPU, intended specifically for AI inference workloads and taking over that role from the Tesla P4.

Nvidia claims the new GPU is up to 12x more power-efficient than its Pascal predecessor. The company has released a suite of benchmark tests showing the T4 blasting past its competition, though as always, such vendor results should be taken with a grain of salt. We've seen Intel release test results claiming its own Xeon processors are excellent at inference, for example. How true any of these claims are likely depends on optimization flags and the specific test configurations and scenarios used.

Specs on the new T4 are impressive. 16GB of GDDR6 feeds a cluster of 2560 CUDA cores and 320 Turing Tensor cores, all within a svelte 75W power profile. THG reports that the Tesla T4 has an INT4 and even an experimental INT1 mode, with up to 65 TFLOPS of FP16, 130 TOPS of INT8, and 260 TOPS of INT4 performance on tap. The older P4, in contrast, offers 5.5 TFLOPS of FP16 and 22 TOPS of INT8. Nvidia says there are optimizations for AI video applications as well, and a beefed-up decoder that can handle up to 38 HD video streams simultaneously.
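To see why those lower-precision modes matter for inference, consider that a trained network's float weights can be mapped onto small integers, letting the GPU use far cheaper integer math. The sketch below is a minimal, illustrative example of symmetric INT8 quantization in plain Python; the values, scale choice, and function names are assumptions for demonstration, not Nvidia's actual implementation.

```python
# Illustrative sketch of symmetric INT8 quantization (not Nvidia's method):
# floats are scaled into the int8 range [-128, 127], computed on cheaply,
# then scaled back. The small round-trip error is the accuracy trade-off.

def quantize_int8(values, scale):
    """Map floats to int8 by dividing by the scale factor and rounding."""
    return [max(-128, min(127, round(v / scale))) for v in values]

def dequantize_int8(qvalues, scale):
    """Recover approximate floats from their int8 representation."""
    return [q * scale for q in qvalues]

# Hypothetical weights from a trained layer.
weights = [0.52, -1.73, 0.004, 0.91]

# Symmetric scale: largest magnitude maps to the int8 extreme.
scale = max(abs(w) for w in weights) / 127

q = quantize_int8(weights, scale)
approx = dequantize_int8(q, scale)
max_err = max(abs(a - w) for a, w in zip(approx, weights))

print(q)        # integer representation of the weights
print(max_err)  # rounding error is bounded by scale / 2
```

The appeal for data center inference is that the error introduced per weight is bounded by half the scale factor, while integer multiply-accumulate units are much denser and cheaper than floating-point ones, which is how the T4's INT8 and INT4 throughput can double and quadruple its FP16 figure.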

It’s not always clear how these technologies will impact consumers; Nvidia’s push to introduce ray tracing and DLSS is the most prominent example we have so far of a company working to take the designs it built for HPC and bring them over to the consumer space. We don’t yet know if it’ll work. But there’s clearly a multi-way fight brewing among the largest titans of the industry — and Nvidia wants to take an early leadership position with its new line of GPUs.
