Once the M1 hit a few weeks back, it was clear that the diminutive processor was but a sign of things to come. Reports suggest that Apple will be upping the competitive ante in short order. The company plans to launch M1 follow-ups with up to 16 high-performance cores in 2021, targeting the MacBook Pro and iMac market. In 2022, it’ll launch machines with 32 high-performance cores in systems like the Mac Pro. While the 2021 CPU might still tap the FireStorm CPU core, it’s a good bet that the 2022 CPU will be at least one generation more advanced.
All of these rumors come from Bloomberg, which has a good track record when it comes to Apple CPU coverage. The same report notes that Apple wants to bring laptop chips with 16-core and 32-core GPUs to market, and that the company is eyeing chips with 64 or 128 dedicated GPU cores. Each GPU manufacturer defines a “core” somewhat differently, so comparing Apple’s “128-core GPU” with, say, a 4096-core GPU from AMD or Nvidia isn’t meaningful, but the plan to rapidly scale up GPU performance is an effort to replace AMD GPU hardware the same way Intel will be pushed out of the CPU stack.
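One way to see why vendor "core" counts aren't comparable: peak throughput depends on the total number of FP32 ALUs and the clock, not on how those ALUs are grouped into marketing "cores." Here's a minimal sketch; the ~128-ALUs-per-Apple-core figure and both clock speeds are illustrative assumptions, not specs from the report.

```python
def fp32_tflops(alus: int, clock_ghz: float) -> float:
    # Peak FP32 throughput: assume each ALU retires one fused
    # multiply-add (2 floating-point ops) per cycle.
    return alus * 2 * clock_ghz / 1000.0

# Assumed figures for illustration: ~128 FP32 ALUs per Apple GPU "core."
apple_128_core = fp32_tflops(128 * 128, 1.28)  # hypothetical part, ~41.9 TFLOPS
# AMD/Nvidia count each ALU ("shader" / "CUDA core") individually,
# so a 4096-core GPU at an assumed 2.0 GHz:
amd_4096_core = fp32_tflops(4096, 2.0)  # ~16.4 TFLOPS
```

The point isn't the specific numbers; it's that "128 cores" and "4096 cores" can describe hardware in the same performance class once you normalize for how each vendor groups its ALUs.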
The near-term threat of the M1 is modest. Intel will lose some market share, and Apple may snap some up. The vast majority of people who don’t buy Macs today will continue not buying them in the immediate future.
The problem is not the M1 itself. The problem is twofold. First, the M1 is merely the harbinger of future Apple CPUs built on more advanced architectures that will drive further up the product stack. Second, the M1 is a bellwether, demonstrating to other silicon design firms that it is possible to build an ARM chip that competes with or outperforms an x86 CPU. If Apple’s M-class designs gain on Intel and AMD faster than Intel and AMD can pull away from them, Apple is going to gain ground. An estimated 2022 launch date is well in line with our own estimates of how long it could take Apple to challenge the top of the x86 product stack and/or exceed it.
As for the idea that the M1 could inspire other companies, consider the following hypothetical scenarios:
Scenario 1: Apple builds an ARM CPU just as fast as an x86 CPU, and is able to leverage its own ecosystem to deliver a modest improvement in performance per watt. The product compares excellently with x86 in low-power mobile, but fails to dislodge high-performance / high-power (35W TDP+) x86 products from market dominance.
Scenario 2: Apple builds an ARM CPU that’s dramatically faster than any x86 CPU, in both absolute performance and performance per watt. The gap is so large that previous non-Apple users begin switching to Apple. Qualcomm, Microsoft, Nvidia, and Samsung ally to create new high-performance SoCs intended for Windows machines, appealing to Windows users who do not want to use Apple software but want access to at least some of that additional performance and power efficiency.
Whether we get Scenario #1 or Scenario #2 is going to depend on how large the gap is between future Mx CPUs and future x86 CPUs, but if we get #2, other companies are going to start looking to get in on the action. x86 CPUs command enormous premiums compared with mobile cores, even if they ship in much lower volumes.
AMD’s success since 2017 is proof that the tactic of offering more cores with better scaling at higher effective performance can pull customers away from a large, entrenched competitor. Microsoft has signaled that the firm is open to designing its own silicon if the need arises, and while it has no plans to enter the CPU market itself, that would change if the only options were “design its own silicon” or “lose the Windows market.” Microsoft may not prioritize Windows as it used to, but Windows is still an enormous part of its earnings and core business. Android has proven to be no threat to the desktop and laptop space, but Apple’s new Macs aren’t ever going to ship with Windows installed — and that’s a threat to Microsoft’s earnings and market share.
All of this comes back to performance, and the performance figures — even ones adjusted for differences in CPU resource utilization and SMT — do not particularly favor x86.
The fact that the M1 needs fewer threads to hit its performance figures — as discussed in this story from earlier today — is a strength of the CPU, not a weakness. AMD already ships a 32-core Threadripper, but taking full advantage of every core requires an application that can scale up to 64 threads. Running two threads through one core to boost performance, as Intel and AMD both do with SMT, is a clever way to increase efficiency, but it requires that an application spin off enough threads to perform useful work and load the chip effectively. At a certain point, you run out of room to keep adding cores. The 64-core 3990X doesn’t scale well against the 32-core 3970X outside of rendering applications because Windows 10’s support for more than 64 threads, which requires applications to be aware of processor groups, is kludgy.
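To make the scaling problem concrete: an SMT chip is only fully loaded when the application hands the scheduler at least one runnable task per logical CPU (cores × threads per core). A minimal sketch, with the 128-thread figure below simply being the 3990X's 64 cores × 2 SMT threads:

```python
import os

def partition(work_items, logical_cpus=None):
    """Split work into one chunk per hardware thread.

    An SMT chip is only fully loaded when the application creates at
    least one runnable task per logical CPU (cores x threads-per-core);
    fewer chunks leave hardware threads idle.
    """
    n = logical_cpus or os.cpu_count() or 1
    chunks = [[] for _ in range(n)]
    # Round-robin items across chunks so the load stays balanced.
    for i, item in enumerate(work_items):
        chunks[i % n].append(item)
    return chunks

# A 64-core / 128-thread part (e.g. the 3990X) needs at least 128 chunks
# of useful work before every hardware thread has something to do:
chunks = partition(list(range(1000)), logical_cpus=128)
```

This is also why the M1's approach scales down gracefully: a chip that hits its numbers with eight chunks of work benefits far more software than one that needs 128.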
None of this is to say that AMD and Intel cannot answer the M1, or Apple more generally. Since 2017, AMD has set records for how quickly it has scaled Zen’s performance. Intel’s Tiger Lake is a marked improvement over Ice Lake and currently leads in mobile. By 2022, Intel will either have committed to using third-party foundries for leading-edge nodes or to closing the manufacturing gap between itself and TSMC.
What I suspect, however, is that a major pivot in CPU designs is coming. AMD and Intel both hold ARM architectural licenses, and both firms are certain to be conducting deep analyses of the M1 and exactly how it achieves its performance. AMD additionally has the K12 — a CPU I’ve been told on several occasions was shelved rather than explicitly canceled. While not much is known about the core, one of its features was the ability to decode both x86 and ARM instructions.
I expect that by 2022, both AMD and Intel will have their own new technology deployments explicitly intended to draw down x86 power consumption, improve efficiency, and boost performance per watt. The M1’s appearance will have lit a fire under such efforts at both companies. Timelines at the top of the market are also longer — if Apple launches a highly-competitive-to-superior 32-core part in 2022, it’d probably be 2023 or 2024 before said chip began bleeding off market share — but the stakes are also higher.
Intel’s entire justification for foundry self-ownership rests on the sales of high-end Xeon and Core i7 / i9 CPUs. The loss of these markets would be disastrous for the firm’s financials. AMD has much more experience operating on low margins and absolutely no interest in returning to the days where the question wasn’t “I wonder how much profit AMD made this quarter,” but “I wonder if AMD managed to lose less than $500M?”
Both companies will fight tooth and nail for their own market share. Both enjoy benefits like long-term guaranteed back-compatibility, familiarity, and customer loyalty. The sheer size of the x86 ecosystem is its own bulwark, and the final outcome of the x86 versus ARM fight, long-term, is uncertain.
But here’s one thing I am certain of: The M1 will be considered an inflection point in the history of CPUs, if only because it’s the first real challenge to x86 hegemony in decades. AMD and Intel will have to improve their own designs to meet that challenge, and even if they do so successfully, the products they build afterward will continue on a different evolutionary path than they might have taken otherwise.