One of Intel’s Recent Bug Fixes Carries a Performance Penalty

When Intel released its report on 77 CPU bugs and security issues that it recently patched, it didn’t mention any of the security fixes causing performance issues. As far as we know, none of the patches we discussed in our previous story carry performance implications, but there is a new Intel update that carries a small performance penalty.

Intel has discovered a separate erratum in its Skylake CPUs that has nothing to do with the Spectre or Meltdown issues we’ve previously discussed. We can’t fully discuss the issue because one of the two whitepapers isn’t currently available on the website where it is supposed to be hosted. The other paper, which concerns mitigation efforts, is currently online. The Jump Conditional Code (JCC) erratum is related to “complex microarchitectural conditions involving jump instructions that span 64-byte boundaries (cross cache lines).” According to Intel, “the erratum may result in unpredictable behavior when certain multiple dynamic microarchitectural conditions are met.”

Performance Impacts

According to Intel:

The JCC erratum MCU workaround will cause a greater number of misses out of the Decoded ICache and subsequent switches to the legacy decode pipeline. This occurs since branches that overlay or end on a 32-byte boundary are unable to fill into the Decoded ICache. Intel has observed performance effects associated with the workaround ranging from 0-4% on many industry-standard benchmarks.

In subcomponents of these benchmarks, Intel has observed outliers higher than the 0-4% range. Other workloads not observed by Intel may behave differently. Intel has in turn developed software-based tools to minimize the impact on potentially affected applications and workloads. The potential performance impact of the JCC erratum mitigation arises from two different sources:

1. A switch penalty that occurs when executing in the Decoded ICache and switching over to the legacy decode pipeline.

2. Inefficiencies that occur when executing from the legacy decode pipeline that are potentially hidden by the Decoded ICache.
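To make the 32-byte condition in Intel’s description concrete, here is a minimal Python sketch of the address check it implies. The function name and the model of an instruction as a start address plus a byte length are our own assumptions for illustration; this is not Intel’s tooling, and it ignores details such as which instruction types are affected.

```python
BOUNDARY = 32  # bytes; the boundary size named in Intel's description


def touches_boundary(start: int, length: int, boundary: int = BOUNDARY) -> bool:
    """Return True if an instruction occupying bytes [start, start + length)
    crosses a boundary or ends exactly on one.

    Hypothetical helper: 'start' is the address of the first byte and
    'length' is the instruction size in bytes.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    last_byte = start + length - 1
    # Crosses: the first and last bytes fall in different 32-byte blocks.
    crosses = start // boundary != last_byte // boundary
    # Ends on: the byte right after the instruction starts a new block.
    ends_on = (start + length) % boundary == 0
    return crosses or ends_on


if __name__ == "__main__":
    print(touches_boundary(0, 2))   # False: sits well inside the first block
    print(touches_boundary(30, 2))  # True: ends exactly on the 32-byte boundary
    print(touches_boundary(30, 4))  # True: spans bytes 30..33, crossing the boundary
    print(touches_boundary(32, 2))  # False: starts on the boundary but stays inside the block
```

Under this simplified model, a jump for which the check returns True is one that, per Intel’s description, could not fill into the Decoded ICache once the microcode workaround is active.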

Intel is working to reduce the impact with toolchain and software updates, and it worked with Phoronix to get these changes into software so they could be evaluated. The remainder of Intel’s document is taken up with discussions of how to mitigate the issue and with details on which CPU families are affected. Affected chips include Amber Lake, Cascade Lake, Coffee Lake, Kaby Lake, Kaby Lake X, Skylake, and Whiskey Lake, so basically everything back to Skylake. CPUs prior to Skylake are not affected, even though the cache changes that give rise to this erratum were introduced in Sandy Bridge.
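As a rough illustration of how a toolchain-side mitigation could work, the Python sketch below lays out a sequence of instructions and pads before any jump that would otherwise cross or end on a 32-byte boundary. This is a simplified model under our own assumptions: the function names and the `(length, is_jump)` representation are hypothetical, real assemblers have more options for how to pad, and the actual patches may handle cases this sketch ignores.

```python
from typing import List, Tuple

BOUNDARY = 32  # bytes; the boundary size named in Intel's description


def touches_boundary(start: int, length: int) -> bool:
    """True if bytes [start, start + length) cross or end on a 32-byte boundary."""
    last_byte = start + length - 1
    return (start // BOUNDARY != last_byte // BOUNDARY
            or (start + length) % BOUNDARY == 0)


def layout_with_padding(
    instructions: List[Tuple[int, bool]], base: int = 0
) -> List[Tuple[int, int, int]]:
    """Place instructions one after another, padding before problematic jumps.

    instructions: hypothetical (length_in_bytes, is_jump) pairs.
    Returns one (padding_bytes, start_address, length) tuple per instruction.
    """
    addr = base
    placed = []
    for length, is_jump in instructions:
        if not 0 < length < BOUNDARY:
            raise ValueError("instruction length must be between 1 and 31 bytes")
        pad = 0
        if is_jump and touches_boundary(addr, length):
            # Push the jump to the start of the next 32-byte block, where an
            # instruction shorter than 32 bytes can neither cross nor end on
            # a boundary.
            pad = BOUNDARY - (addr % BOUNDARY)
        addr += pad
        placed.append((pad, addr, length))
        addr += length
    return placed


if __name__ == "__main__":
    # 30 bytes of ordinary code, a 2-byte jump, then a 5-byte instruction.
    for pad, start, length in layout_with_padding([(30, False), (2, True), (5, False)]):
        print(f"pad={pad} start={start} length={length}")
```

In this example the 2-byte jump would have ended exactly on byte 32, so the sketch inserts 2 bytes of padding and the code grows from 37 to 39 bytes. That trade of slightly larger code for better-placed jumps is the general idea behind recovering some of the lost performance in software.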

According to Phoronix’s extensive tests, the update hits average performance “by a couple of percent,” some of which can be recovered by compiler patches and Linux updates that will take some time to be merged and to trickle down to users. It’s not clear what sort of timeline Windows users should expect or what performance losses look like in that operating system.

Data and graph by Phoronix

This single result from Phoronix shows the broad overall pattern: a performance loss with the initial microcode update, followed by partial recovery with the newly patched code. Some other application tests exceed the 4 percent threshold that Intel identified, but these appear to be outliers. In some tests the new microcode is simply faster than the old microcode, and the patches Intel has developed clearly aren’t finalized yet; there are still some cases where the patched code hurts performance further rather than helping. The point of publishing this data now, according to Phoronix, was to illustrate that the drops may be temporary.

Ultimately, this kind of move is likely the result of Intel cleaning house and conducting security reviews of its own products, then moving to patch errata, even those whose fixes might impact performance. That’s going to frustrate users who see performance dips, and the impact of these dips can exceed the 4 percent threshold, but it’s also the right move for the company to make long-term. Hopefully, updates to software toolchains and OS support will minimize the performance impact of these changes, which, again, appear to be unrelated to any of the issues we’ve discussed with Spectre and Meltdown.

It’s not yet clear if this fix is part of the bundle that Intel announced earlier on Tuesday, or if it will be delivered separately. Thus far, most of the performance impact of Spectre, Meltdown, and related fixes has hit server software more than client. Assuming that holds true, end-users should see few declines. If it doesn’t, you’ll hear about it here.
