Spectre and Meltdown are two of the most significant security issues to surface since the beginning of this millennium. Spectre, in particular, is going to be difficult to mitigate. Both AMD and Intel will have to redesign how their CPUs function to fully address the problem. Even if the performance penalties fall hardest on older CPUs or server workloads, rather than on workstation, gaming, or general-purpose compute, there are going to be cases where certain customers have to eat a performance hit to close the security gap. All of this is true. But in the wake of these revelations, we’ve seen various people opine that the flaws mean the end of the x86 architecture or, now, that they are the final death knell for Moore’s Law.
That’s the opinion of The Register, which has gloomily declared that these flaws represent nothing less than the end of performance improvements in general purpose compute hardware. Mark Pesce writes: “[F]or the mainstay of IT, general purpose computing, last month may be as good as it ever gets.”
A short-term decline in performance in at least some cases is guaranteed. But the longer-term case is more optimistic, I’d argue, than Pesce makes it sound.
Sharpening the Argument
Before we can dive into this any further, we need to clarify something. Pesce refers to this potential end of general compute performance improvements as the end of Moore’s Law, but that’s not really true. Moore’s Law predicts that transistor density will double every 18-24 months. The associated “law” that delivered the performance improvements that went hand-in-hand with Moore’s Law was known as Dennard Scaling, and it stopped working in 2005. Not coincidentally, that’s when frequency scaling slowed to a crawl as well.
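To make the 18-24 month range concrete, here is a minimal sketch of how the two doubling periods compound over a decade. The function name and the ten-year horizon are illustrative assumptions, not figures from Pesce or from Moore.

```python
def density_multiplier(years: float, doubling_months: float) -> float:
    """Factor by which transistor density grows after `years`,
    assuming it doubles once every `doubling_months` months."""
    doublings = (years * 12.0) / doubling_months
    return 2.0 ** doublings


# Compare both ends of the 18-24 month range over an assumed 10-year horizon.
for months in (18, 24):
    factor = density_multiplier(10, months)
    print(f"doubling every {months} months: {factor:.1f}x after 10 years")
```

The gap between the two ends of the range is large: roughly 32x per decade at a 24-month cadence versus roughly 100x at 18 months. Either way, the figure describes density, not performance.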
Even as a metric for gauging density improvements, Moore’s Law has been reinvented multiple times by the semiconductor industry. In the 1970s and 1980s, higher transistor densities meant more functions could be integrated into a single CPU die.
Moore’s Law 2.0 focused on scaling up performance by sending clock speeds rocketing into the stratosphere. From 1978 to 1993, clock speeds increased from 5MHz (8086) to 66MHz (original Pentium), a gain of 13.2x in 15 years. From 1993 to 2003, clock speeds increased from 66MHz to 3.2GHz, an improvement of 48.5x in ten years. While the Pentium 4 Northwood wasn’t as efficient, clock-for-clock, as Intel’s older Pentium 3, it incorporated many architectural enhancements and improvements compared with the original Pentium, including support for SIMD instructions, an on-die full speed L2 cache, and out-of-order execution. This version of Moore’s Law essentially ended in 2005.
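The clock-speed figures above can be checked with a few lines of arithmetic. This sketch recomputes the overall gain for each era and the equivalent compound annual growth rate. The span lengths are taken directly from the years quoted (1978 to 1993 and 1993 to 2003), and the helper name is an assumption made for illustration.

```python
def annual_growth(start_mhz: float, end_mhz: float, years: int) -> float:
    """Compound annual growth factor that turns start_mhz into end_mhz."""
    return (end_mhz / start_mhz) ** (1.0 / years)


# (label, starting clock in MHz, ending clock in MHz, first year, last year)
eras = [
    ("8086 to Pentium", 5.0, 66.0, 1978, 1993),
    ("Pentium to Pentium 4", 66.0, 3200.0, 1993, 2003),
]

for label, start, end, first, last in eras:
    years = last - first
    total = end / start
    yearly = annual_growth(start, end, years)
    print(f"{label}: {total:.1f}x over {years} years, "
          f"about {100 * (yearly - 1):.0f}% per year")
```

Expressed per year, the second era grew far faster than the first: roughly 47 percent annually versus roughly 19 percent.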
Moore’s Law 3.0 has focused on integrating other components. Initially, this meant additional CPU cores, at least in the desktop and laptop space. Later, as SoCs became common, it has meant features like onboard GPUs, cellular and Wi-Fi radios, I/O blocks, and PCI Express lanes. This type of integration, along with density improvements in SoCs in general, has continued apace and will not end in the next few years, at least. The ability to deploy memory like HBM2 on a CPU package is a further example of how improving integration technology has improved overall system performance.
In short, it’s inaccurate to refer to Meltdown and Spectre ending “Moore’s Law.” But since references to Moore’s Law are still generally used as shorthand for “improved computer performance,” it’s an understandable usage and we’ll engage with the larger question.
Why Meltdown and Spectre Aren’t the End of CPU Performance Improvements
This isn’t the first time CPU engineers have considered profound changes to how CPUs function in order to plug security holes or improve performance. The CISC (Complex Instruction Set Computing) CPUs of the 1960s to 1980s relied on single instructions that could execute a multi-step operation partly because both RAM and storage were extremely expensive, even compared with the cost of the processor itself.
As RAM and storage costs dropped and clock speeds increased, design constraints changed. Instead of focusing on code density and instructions that might take many clock cycles to execute, engineers found it more profitable to build CPUs with more general-purpose registers, a load/store architecture, and simpler instructions that could execute in one cycle. While x86 is officially considered a CISC architecture, all x86 CPUs translate x86 instructions into simplified, RISC-like micro-ops internally. It took years, but ultimately, RISC “won” the computing market and transformed it in the process.
The history of computing is definitionally a history of change. Spectre and Meltdown aren’t the first security fixes to affect performance; when Data Execution Prevention rolled out with Windows XP SP2 and AMD’s Athlon 64, there were cases where users had to disable it to make applications run properly or at the desired speed. Spectre in particular may represent a larger problem, but it’s not so large as to justify concluding there are few-to-no ways of improving performance in the future.
Furthermore, the idea that general purpose compute has stopped improving is inaccurate. It’s true that the pace of improvements has slowed and that games, in particular, don’t necessarily run faster on a Core i7-8700K than on a Core i7-2600K, despite the more than six years between them. But if you compare CPUs on other metrics, the gaps are different.
The following data is drawn from Anandtech’s Bench site, which allows users to compare results between various CPUs. In this case, we’re comparing the Core i7-3770K (Ivy Bridge) with the Core i7-6700 (Skylake). The 3770K had a 3.5GHz base and 3.9GHz boost clock, while the 6700 has a 3.4GHz base and 4GHz boost. That’s as close as we’re going to get when comparing clock-for-clock performance between two architectures (Ivy Bridge’s microarchitecture was nearly identical to Sandy Bridge’s, with virtually no performance difference between them).
There are more results on Anandtech, including Linux data and game comparisons (which show much smaller differences). We picked a representative sample of these results to determine the average performance improvement between Ivy Bridge and Skylake based on Handbrake, Agisoft, Dolphin, WinRAR, x265 encoding, Cinebench, x264, and POV-Ray.
The average performance boost for Skylake was 1.18x over Ivy Bridge in those eight applications, ranging from 1.07x in WinRAR to 1.38x in the first x264 Handbrake pass. There are tests where the two CPUs perform identically, but they’re not the norm outside of specific categories like gaming.
An 18 percent average improvement over several years is a far cry from the gains we used to see, but it isn’t nothing, either. And there’s no sign that these types of gains will cease in future CPU architectures. It may take a few years to shake these bugs off, particularly given that new CPU architectures take time to design, but the long-term future of general computing is brighter than it may appear. CPU improvements may have slowed, but there’s still some gas in the tank.
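For a rough sense of scale, the sketch below converts the 1.18x figure into an annualized rate and a doubling time. It assumes a three-year gap between the two architectures (Ivy Bridge launched in 2012 and Skylake in 2015); that gap is an assumption added here rather than something the benchmark data states, and the function names are illustrative.

```python
import math


def annualized_gain(total_speedup: float, years: float) -> float:
    """Per-year growth factor equivalent to total_speedup over `years`."""
    return total_speedup ** (1.0 / years)


def years_to_double(annual_factor: float) -> float:
    """Years needed to double performance at a constant annual factor."""
    return math.log(2.0) / math.log(annual_factor)


# Assumption: roughly three years separate Ivy Bridge (2012) and Skylake (2015).
yearly = annualized_gain(1.18, 3)
print(f"annualized gain: about {100 * (yearly - 1):.1f}% per year")
print(f"doubling time at that pace: about {years_to_double(yearly):.1f} years")
```

Under that assumption the pace works out to a little under 6 percent per year, which would double performance in roughly 12 to 13 years. That is slow by historical standards, but it is not zero.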
Moore’s Law may well pass into history as CMOS devices approach the nanoscale. Certainly there are some people who think it will, including Intel’s former chief architect and Gordon Moore himself. But if history is any indication, the meaning of the phrase is more likely to morph once again, to capture different trends still driving at the same goal — the long-term improvement of compute performance.