SC19: Intel Unveils New GPU Stack, oneAPI Development Effort

Intel made several significant announcements at Supercomputing 19 on Sunday, including new details on its Xe GPU architecture and a programming model it calls oneAPI. Both are critical to the company’s future plans: Xe represents Intel’s first-ever push into data center GPUs and its first discrete GPU in nearly a decade, while oneAPI is part of Intel’s effort both to expand its total addressable market and to unify the programming environment developers use to target its products.

The goal of oneAPI is to present a single unified development target for four major types of workloads (scalar, vector, matrix, and spatial) and the various components Intel manufactures (CPUs, GPUs, FPGAs, and other AI accelerators from companies such as Movidius and Mobileye). A central aim is to abstract away the work of optimizing for any one architecture, allowing developers to focus on writing code that runs on any supported hardware.

The “write once, run anywhere” idea that Intel is going for with oneAPI is clearly reminiscent of Java, but there are major differences between the two. Java compiles to bytecode and runs inside a JVM, while oneAPI is a set of libraries. Those libraries translate hardware-agnostic API calls into lower-level code specific to whatever target hardware is present in the system. oneAPI isn’t completely without targeting: users are expected to specify whether they’re writing code for an FPGA, CPU, or GPU, for example. Anything more specific than that device class should be abstracted away.
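The general pattern described above, one hardware-agnostic call plus a coarse device-class selection, can be sketched in plain C++. This is only an illustration of the idea and not oneAPI code: `DeviceKind`, `Result`, `add_host`, and `vector_add` are hypothetical names invented for this sketch, and every "backend" here simply runs on the host.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical device classes; illustrative names, not oneAPI identifiers.
enum class DeviceKind { Cpu, Gpu, Fpga };

// Output of the call plus a label naming the backend that was selected.
struct Result {
    std::vector<float> values;
    std::string backend;
};

// Stand-in for device-specific code. A real library would emit different
// low-level code per target; this sketch always computes on the host.
static std::vector<float> add_host(const std::vector<float>& a,
                                   const std::vector<float>& b) {
    if (a.size() != b.size()) throw std::invalid_argument("size mismatch");
    std::vector<float> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] + b[i];
    return out;
}

// The single hardware-agnostic entry point the caller sees. The caller picks
// only the device class; everything below that level is hidden.
Result vector_add(DeviceKind kind,
                  const std::vector<float>& a,
                  const std::vector<float>& b) {
    switch (kind) {
        case DeviceKind::Cpu:  return {add_host(a, b), "cpu"};
        case DeviceKind::Gpu:  return {add_host(a, b), "gpu"};
        case DeviceKind::Fpga: return {add_host(a, b), "fpga"};
    }
    throw std::invalid_argument("unknown device kind");
}
```

Calling `vector_add(DeviceKind::Gpu, a, b)` and `vector_add(DeviceKind::Cpu, a, b)` uses the same source-level call in both cases; only the device-class argument changes, which is the level of targeting the article describes.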

Ponte Vecchio: Intel’s First Data Center GPU

Intel also unveiled details on Ponte Vecchio, its first data center and HPC GPU. The name comes from the Ponte Vecchio, a medieval bridge in Florence. It isn’t clear why Intel picked this particular name; the company may have opted for famous bridges as a codename source. ServeTheHome has extensive details on Ponte Vecchio, which is optimized more for compute workloads than for graphics. The design uses a variable vector width and can handle both SIMT and SIMD execution, offering top performance when both modes are used.

Ponte Vecchio can scale to thousands of execution units (EUs), though firmer figures were not offered, and supports data types like INT8, bfloat16, and FP16. Xe is said to offer a 40x increase in double-precision floating-point throughput per EU compared with Intel’s existing integrated graphics. Xe will use CXL as a coherent interconnect between CPU and GPU. The GPU also includes something called a “Rambo” cache connected to the XEMF (Xe Memory Fabric).
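For readers unfamiliar with bfloat16: it keeps the sign bit and 8-bit exponent of a 32-bit float but only 7 mantissa bits, so it corresponds to the upper 16 bits of an FP32 value. The sketch below shows one common software conversion with round-to-nearest-even. It says nothing about how Ponte Vecchio implements the type in hardware, and the function names `float_to_bf16` and `bf16_to_float` are made up for this example.

```cpp
#include <cstdint>
#include <cstring>

// Convert a 32-bit float to bfloat16 bits using round-to-nearest-even.
std::uint16_t float_to_bf16(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // reinterpret the float's bit pattern
    if ((bits & 0x7FFFFFFFu) > 0x7F800000u) {
        // NaN: keep the top bits and force a mantissa bit so it stays a NaN.
        return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);
    }
    // Add 0x7FFF plus the lowest kept bit, so exact ties round to even.
    std::uint32_t rounding = 0x7FFFu + ((bits >> 16) & 1u);
    bits += rounding;
    return static_cast<std::uint16_t>(bits >> 16);
}

// Convert bfloat16 bits back to float by placing them in the upper half
// of a 32-bit word and zero-filling the lower 16 mantissa bits.
float bf16_to_float(std::uint16_t h) {
    std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

Round-tripping 3.14159f through these two functions yields 3.140625, which illustrates the trade-off: bfloat16 preserves FP32's dynamic range while keeping only two to three significant decimal digits.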

Intel believes the cache is essential to its plan for improving performance on large matrices. Both of Intel’s newer interconnect and packaging technologies are in play on this project, with EMIB used to connect HBM and Foveros used for the Rambo cache. Ponte Vecchio will be built on Intel’s 7nm process, and it may be the GPU that Intel expects to debut on that node when it’s ready for manufacturing.

oneAPI and Xe are both critical components of Intel’s broader computing strategy. The company has articulated a multi-faceted future that leverages FPGAs, CPUs, GPUs, and other accelerators from the Loihi and NNP-I/NNP-T families to create an overall product ecosystem. We’ll start to see how those plays are coming together in 2020, as consumer Xe moves into production and next-generation products built on 10nm ship in greater volume.