To celebrate the 40th anniversary of the 8086 and the debut of the x86 architecture, we’re launching a new retrospective on some of Intel’s most important CPU designs. In this article, we’ve rounded up the first decades of that history, from the 4004 in 1971 to the Pentium Pro in 1995. This period covers the first two eras of Moore’s Law (a concept we’ve discussed elsewhere): first, discrete capabilities were rapidly integrated onto a single piece of silicon, and then microprocessor transistor counts and clock speeds continued to rise.
Intel's 4004 was released in 1971 and represented a true milestone in computer history: it was the first commercial microprocessor completely integrated into a single chip. Designed by Federico Faggin, the 4004 ran at 740kHz, used a four-bit microarchitecture, and took eight clock cycles to execute a complete instruction. It was built for a Japanese company, Busicom, that had contracted Intel to build a line of chips suitable for use in future calculators. It was branded as the 4004 as part of a new naming scheme meant to identify the entire 4000 family as parts of an integrated whole (the 4001-4003 part numbers referred to other components within the design). After renegotiating its agreement with Busicom, Intel acquired the rights to market the design to other companies and began selling the MCS-4 microcomputer set by late 1971. The design used 2,300 transistors.
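As a back-of-the-envelope check on those figures, the sketch below divides the 4004's 740kHz clock by the eight clock cycles per instruction cited above. The function name is hypothetical, and the model assumes every instruction takes exactly eight cycles, so treat the result as a rough ceiling rather than a measured benchmark.

```python
def instructions_per_second(clock_hz: float, cycles_per_instruction: int) -> float:
    """Rough throughput estimate: clock rate divided by cycles per instruction."""
    return clock_hz / cycles_per_instruction


# Figures from the text: 740kHz clock, eight clock cycles per instruction.
rate_4004 = instructions_per_second(740_000, 8)
print(f"{rate_4004:,.0f} instructions per second")  # 92,500 instructions per second
```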
The Intel 8008 traded some clock speed for enhanced capabilities: it ran at 500kHz, as opposed to the 4004's 740kHz. While somewhat slower, it could operate on eight bits of data at a time, rather than the four-bit limitation of the 4004. Although it was commissioned before the 4004, the chip was late to market, and the company that ordered it, Computer Terminal Corporation, later decided not to use the design. The company agreed to allow Intel to market the chip independently, and the 8008 became a commercial success. It was used in the French-built Micral-N microcomputer in 1972 and the Canadian MCM/70 beginning in 1974. The chip used 3,500 transistors, which made it significantly more complicated than the 4004 (back then, a few thousand transistors was a serious engineering effort). Federico Faggin didn't lead the entire project, but he finalized the design and it was finished under his guidance. In 2008, Faggin called the 8008 "the ancestor of the Pentium Pro."
The Intel 8080 was a 1974 design that followed the 8008 and drastically increased its clock speed, from 500kHz to 2MHz (later chips running at up to 3.125MHz would be released). The 8080 was the first chip Faggin designed from scratch, and it used the n-channel MOS process (the 4004 and 8008 had used p-channel MOS), with fewer support chips required and 6,000 transistors in the base design. Intel, back then, was known more for its memory than its CPUs, but the 8080 was a huge success in the budding microcomputer market, with prominent design wins and broad industry uptake. The 8080 was the base CPU model used for the fledgling operating system, CP/M. Intel would follow up the 8080 with the 8085, a binary-compatible CPU with support for 5V operation and several new instructions to support new interrupts.
The 8086 and 8088 were conceived as short-term solutions that would generate revenue for Intel while it worked on its "real" next-generation chip, the iAPX 432. The 8086 and 8088 are nearly identical, but the 8086 implemented 16-bit internal and external buses, while the 8088 connected to the rest of the system via an 8-bit bus. Ironically, Intel built the 8086 and 8088 partly to fend off competition from Federico Faggin, who had left the company and started his own competing business, Zilog. The "86" in the CPU name came from the fact that it had 8 general-purpose 16-bit registers. The final chip contained 20,000 transistors (29,000 if you counted the ROM and PLA), measured 33 square millimeters, and was built on 3.2-micron technology. It's the grandfather of the modern x86 CPU industry.
Today, x86 chips are the backbone of modern computing. ARM may dominate the smartphone industry, but the cloud-based services and platforms that smartphones rely on are sitting in data centers running on x86-based hardware. What’s surprising, looking back, is that no one at Intel had even an inkling that this was going to take place. Intel had sunk its hopes and dreams into the iAPX 432, a 32-bit microprocessor with a radically different design than anything the company had tried before. Early sales of the 8086 and 8088 weren’t very strong, since the entire computer market was facing something of a hardware glut. Intel’s Operation Crush, an aggressive marketing and support effort around the 8086, helped change that and caught IBM’s attention in the process.
Enter IBM. When Big Blue decided to build its first PC, it narrowed the field to three choices: Motorola’s 68000, the Intel 8086, and the Intel 8088. Because the 8088 and 8086 were compatible with each other, it ultimately didn’t matter which Intel CPU IBM picked. IBM was more familiar with Intel than with Motorola, and Microsoft had a BASIC interpreter with x86 support already baked in. If IBM had gone the other way, we might well be sitting here talking about the rise of “Motosoft” instead of “Mintel.” IBM’s decision to back Intel shaped the future of computing, and Intel’s future processors. Over the next few years, OEMs like Compaq brought new systems to market, powered by new, more advanced x86 CPUs.
Below, we’ll discuss the next series of Intel CPUs, starting with the 80286 and running through the Pentium Pro. The 80186 did exist, but it was used primarily as an embedded microcontroller rather than as a PC CPU (with a bare handful of exceptions). For most buyers, the line of succession jumped from the 8086/8088 to the 80286.
The Intel 80286 was the successor to the 8086 (the 80186 was mostly used in embedded markets). It drove IBM's then-new PC/AT platform and was available in 4MHz, 6MHz, 8MHz, and 12.5MHz varieties. The 80286 packed 134,000 transistors and was substantially faster than the 8086 clock-for-clock thanks to de-multiplexed address and data buses, a dedicated adder, and a hardware-based multiplier. In some cases, the 80286 was up to twice as fast as the older 8086, even when running at the same clock speed. The 80286 was also the first Intel CPU designed for multi-user systems and the first x86 chip to feature Protected Mode, though bugs and limitations in the 286's implementation meant this feature wasn't widely used. The 80286 could address up to 16MB of RAM, and while Intel never built a version of the chip above 12.5MHz, AMD and Harris built variants clocked at 20MHz and 25MHz, respectively.
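The 16MB ceiling follows directly from the number of address lines on the chip. The sketch below shows the arithmetic; the address-line counts (20 for the 8086, 24 for the 80286) are standard figures for these parts rather than numbers taken from this article, and the function and dictionary names are hypothetical.

```python
def addressable_bytes(address_lines: int) -> int:
    """Number of distinct byte addresses reachable with the given address lines."""
    return 2 ** address_lines


MIB = 1024 * 1024

# Assumption: address-line counts are not stated in the article itself.
ADDRESS_LINES = {"8086": 20, "80286": 24}

for chip, lines in ADDRESS_LINES.items():
    print(f"{chip}: {addressable_bytes(lines) // MIB}MB")
# 8086: 1MB
# 80286: 16MB
```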
The 80386 was a substantial step forward for Intel's CPU designs. Built on 1.5-micron and 1-micron technology, it was Intel's first 32-bit x86 processor and the first Intel processor that could theoretically address up to 4GB of memory. It expanded and extended the 80286's Protected Mode and included a new virtual 8086 mode that allowed the 386 to emulate multiple 8086 chips simultaneously. This allowed the 386 to execute programs that could only run in Real Mode while using a Protected Mode operating system. The 80386 also supported a flat memory model while in Protected Mode that allowed programs to treat memory as a single, contiguous address space, even though the CPU itself actually used a segmented memory model. The 386SL supported off-chip cache memory mounted on the motherboard (16KB-64KB). As with the 286, other companies introduced 80386 clones: AMD released a 386 clocked at up to 40MHz on a 40MHz bus, compared with the 25MHz and 33MHz buses Intel supported.
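One way to picture the flat memory model is as a special case of segmentation in which every segment starts at address zero and spans the full 4GB. The sketch below is a deliberately simplified model under that assumption: the `Segment` class and `linear_address` function are hypothetical names, and the code ignores descriptor tables, privilege checks, and paging.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    base: int   # linear address where the segment starts
    limit: int  # highest valid offset within the segment


def linear_address(segment: Segment, offset: int) -> int:
    """Translate a segment-relative offset into a 32-bit linear address."""
    if not 0 <= offset <= segment.limit:
        raise ValueError("offset exceeds segment limit")
    return (segment.base + offset) & 0xFFFFFFFF


# Flat model: one segment covering the whole 4GB space, so offset == address.
FLAT = Segment(base=0, limit=0xFFFFFFFF)
# A small segment with a non-zero base, for contrast (illustrative values).
SEGMENTED = Segment(base=0x0010_0000, limit=0xFFFF)

print(hex(linear_address(FLAT, 0x1234_5678)))  # 0x12345678
print(hex(linear_address(SEGMENTED, 0x20)))    # 0x100020
```

With the flat segment, a program never has to think about bases at all, which is why the hardware's segmentation can stay switched on while remaining invisible to software.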
The 80486 launched in 1989, four years after the initial debut of the 80386. It was the first Intel CPU to contain over a million transistors, the first Intel x86 chip with an on-die L1 cache, and the first tightly-pipelined x86 core (a tight pipeline is one in which each stage performs its operations within the same time slot). The 80486 was the first Intel chip to implement a clock multiplier, in which the CPU runs at multiples of the base bus speed, and the first Intel chip that could regularly complete simple instructions within a single clock cycle. It was also the first Intel chip to implement an on-die FPU (the 486DX had this feature, while the 486SX and previous 386DX chips did not). Intel initially attempted to push bus speeds up to 50MHz but was forced to cut them back due to stability and heat problems.
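The clock multiplier idea reduces to a single multiplication: the core runs at the bus speed times a fixed factor. The sketch below uses illustrative bus and multiplier pairs that are assumptions for the example, not figures from this article, and the function name is hypothetical.

```python
def core_clock_mhz(bus_mhz: int, multiplier: int) -> int:
    """Core frequency when the CPU runs at a fixed multiple of the bus clock."""
    return bus_mhz * multiplier


# Illustrative (bus MHz, multiplier) pairs; assumptions for this sketch.
for bus, mult in [(25, 2), (33, 2)]:
    print(f"{bus}MHz bus x{mult} -> {core_clock_mhz(bus, mult)}MHz core")
# 25MHz bus x2 -> 50MHz core
# 33MHz bus x2 -> 66MHz core
```

Decoupling the two clocks let the core get faster without requiring the motherboard bus to keep pace, which sidesteps the kind of bus-speed stability problems described above.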
Intel Pentium and Pentium MMX
The Pentium debuted in 1993 as Intel's first non-numerical x86 processor brand. The new CPU included dual integer pipelines, meaning it could sometimes complete up to two instructions per clock cycle (this is referred to as a superscalar architecture). It used a 64-bit external bus instead of the 486's 32-bit bus and it offered separate instruction and data caches to boost performance. FPU performance was also dramatically improved. Intel would later introduce the Pentium MMX, with support for new multimedia instructions (MMX technology), larger caches, and higher overall performance. The Pentium and Pentium MMX have continued to be relevant to Intel's modern CPU business — its initial Atom core (Bonnell) drew inspiration from the 80486 and P5, as did the first few generations of the Xeon Phi (Knights Ferry, Knights Corner). When Intel wanted to demonstrate near-threshold voltage transistors, it again turned to the Pentium core to do so. While Intel's competitors again introduced their own chips, it was becoming increasingly difficult for the other x86 vendors to build products that could go toe-to-toe with Intel.
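To see why dual pipelines only "sometimes" complete two instructions per clock, consider the toy model below. It issues two adjacent instructions together only when the second does not depend on the first. This is a heavily simplified sketch: the `Instr` type, the pairing rule, and the register names are assumptions for illustration, and the real Pentium's pairing rules were considerably more restrictive.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    dest: str                 # register written
    sources: Tuple[str, ...]  # registers read


def cycles_single_issue(program: List[Instr]) -> int:
    """One instruction per cycle, as in a simple scalar pipeline."""
    return len(program)


def cycles_dual_issue(program: List[Instr]) -> int:
    """Issue in order; pair two adjacent instructions when they are independent."""
    cycles, i = 0, 0
    while i < len(program):
        can_pair = False
        if i + 1 < len(program):
            first, second = program[i], program[i + 1]
            can_pair = first.dest not in second.sources and first.dest != second.dest
        i += 2 if can_pair else 1
        cycles += 1
    return cycles


program = [
    Instr("eax", ("ebx",)),
    Instr("ecx", ("edx",)),        # independent of the first: pairs with it
    Instr("esi", ("eax", "ecx")),
    Instr("edi", ("esi",)),        # depends on the previous result: issues alone
]
print(cycles_single_issue(program), cycles_dual_issue(program))  # 4 3
```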
Intel Pentium Pro
The Intel Pentium Pro is the great-great-grandfather of virtually every consumer CPU Intel has built over the past 21 years. It was the first Intel chip to translate x86 instructions into internal micro-ops and to reorder those translated operations for optimal execution within the CPU. Today, this is a common technique known as out-of-order execution, but it was radical at the time and cost Intel a significant amount of die space and power consumption compared with the Pentium. The rated TDP on a Pentium 200 was 15.5W, while a Pentium MMX had a 15.7W TDP. TDP for the Pentium Pro at 200MHz, in comparison, was 35W. This was partly due to the chip's large off-die L2 cache, which consumed significant amounts of power as well. The Pentium Pro's performance improvements were primarily linked to the use of 32-bit code; it could outperform an equivalent-clocked Pentium by 25-35 percent when running 32-bit code, but was only 5 percent faster when benchmarked in 16-bit code. This limited the chip's adoption, since MS-DOS, Windows 3.1, and Windows 95 were all entirely or significantly based on 16-bit code.
The chips from the 8086 through the Pentium can arguably be grouped as a single family of products, albeit a family that evolved enormously in less than 20 years. All of these chips executed native x86 instructions using what we now call in-order execution (prior to the invention of out-of-order execution, we just called this “execution”). Intel rose to dominate the personal computing market on the strength of these cores. In October 1985, the fastest 80386DX was clocked at 12MHz. By June of 1995, the Pentium 133 was on sale: a greater than 10x increase in clock speed, on top of all the architectural improvements, in just a decade.
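The clock-speed comparison is simple to verify. The sketch below uses only the two frequencies quoted above; the function name is hypothetical, and the ratio reflects clock speed alone, not the additional per-clock gains from architectural changes.

```python
def clock_ratio(old_mhz: float, new_mhz: float) -> float:
    """How many times faster the newer clock is than the older one."""
    return new_mhz / old_mhz


# Figures from the text: 12MHz 80386DX (1985) vs. Pentium 133 (1995).
ratio = clock_ratio(12, 133)
print(f"{ratio:.1f}x")  # 11.1x
```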
By this point, Intel had already largely conquered the personal computer market and begun making early inroads into the workstation and data center spaces, but the bulk of those markets still belonged to various RISC architectures backed by entrenched players like Sun, MIPS, and HP. Intel wanted to expand into data centers and professional workstations, but to do that it needed a CPU architecture that would allow it to compete against these high-end workstation chips on absolute performance. Intel had added manufacturing capacity through the 1980s and 1990s, and any new chip needed to do more than simply boost performance: it needed to be a CPU that could leverage Intel’s growing economies of scale.
The Pentium Pro and its descendants were that CPU. We’ll discuss how they evolved — and the features they brought to market — in our next installment.