On Monday, Nvidia announced a new set of GPUs in a presentation focused on ray tracing and its arrival in today’s games. Nvidia has built an entirely new capability around ray tracing and announced an all-new SM (Streaming Multiprocessor) architecture to support it. But Nvidia also debuted these GPUs at substantially higher prices than its previous generation, and it showed no straight benchmark data that didn’t revolve around ray tracing.
The list goes on. The first DX10 cards weren’t particularly fast, including models like Nvidia’s GeForce 8800 Ultra. The first AMD GPUs to support tessellation in DX11 weren’t all that good at it. If you bought a VR headset and a top-end Pascal, Maxwell, or AMD GPU to drive it, guess what? By the time VR is well-established, if it ever is, you’ll be playing it on very different and vastly improved hardware. The first strike against buying into RTX specifically is that by the time ray tracing is well-established, practically useful, and driving modern games, the RTX 2080 will be a garbage GPU. That’s not an indictment of Nvidia; it’s a consequence of the substantial lead time between when a new GPU feature is released and when enough games take advantage of that feature to make it a serious perk.
But there’s also some reason to ask just how much performance these GPUs are going to deliver, period, and Nvidia left substantial questions on the table on that point. The company showed no benchmarks that didn’t involve ray tracing. To try to predict what we might see from this new generation, let’s take a look at what past cards delivered. We’re helped in this by [H]ardOCP, which recently published a massive generational comparison of the GTX 780 versus the GTX 980 and 1080. They tested a suite of 14 games from Crysis 3 to Far Cry 5. Let’s compare each GPU’s specifications with the rate of performance improvement and see what we can tease out:
There’s a lot going on in this chart, so let’s break it down. When Nvidia moved from Kepler to Maxwell, we see evidence that it made the core far less dependent on raw memory bandwidth (the GTX 980 has markedly less than the 780), and that this cost Nvidia nothing in overall performance. Maxwell was a better-balanced architecture than Kepler, and Nvidia successfully delivered huge performance improvements without a node shift. But while Maxwell used less bandwidth than Kepler did, it still benefited from a huge increase in fill rate, and the overall improvement across 14 games tracks that fill-rate boost. Clock speeds also increased substantially. The percentage comparison data from [H]ardOCP reflects the 14-game improvement for the GTX 980 compared with the GTX 780, and then for the GTX 1080 compared with the GTX 980.
Maxwell to Pascal duplicates this improvement. Fill rate increases a monstrous 1.6x, thanks to the increased clocks (ROPs were identical). Bandwidth surged on the adoption of GDDR5X, and the overall improvement to gaming performance is directly in line with these gains. The point here is this: While any given game may gain more or less depending on the specifics of the engine and the peculiarities of its design, the average trend shows a strong relationship between throwing more bandwidth and fill rate at games and the performance of those titles.
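To make the fill-rate arithmetic concrete, here is a minimal sketch. It assumes the usual theoretical definition of pixel fill rate (ROP count multiplied by core clock). The clock values are illustrative placeholders chosen to reproduce a 1.6x gain with an unchanged ROP count; they are not the actual clocks of any card discussed here, and the function name is hypothetical.

```python
def pixel_fill_rate(rops: int, clock_mhz: float) -> float:
    """Theoretical pixel fill rate in gigapixels per second (ROPs x clock)."""
    return rops * clock_mhz / 1000.0


# Placeholder inputs (assumptions, not real card specifications):
# same ROP count in both generations, only the clock changes.
old_gen = pixel_fill_rate(rops=64, clock_mhz=1000.0)
new_gen = pixel_fill_rate(rops=64, clock_mhz=1600.0)

# With identical ROPs, the fill-rate gain equals the clock-speed gain.
fill_rate_gain = new_gen / old_gen
print(f"{old_gen:.1f} GP/s -> {new_gen:.1f} GP/s ({fill_rate_gain:.2f}x)")
```

Because the ROP count cancels out of the ratio, any fill-rate gain between two cards with identical ROPs comes entirely from clock speed, which is the situation the paragraph above describes.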
The Ray Tracing Future Isn’t Here Yet
Furthermore, once you look at what Nvidia is using RTX for, it’s clear that the company isn’t actually going to deliver completely ray-traced games. Instead, the focus here is on using ray tracing to handle certain specific tasks, like improved noise reduction or shadow work. And that’s fine, as far as it goes. Screenshots from PCGamesN (provided by Nvidia’s comparison tools) show that RTX can make a nice difference in certain scenes:
But the RTX hardware in the Nvidia GPU, including the RTX 2080 Ti, isn’t going to be fast enough to simply ray trace an entire AAA game. Even if it were, game engines themselves are not designed for this. This point simply cannot be emphasized enough. There are no ray tracing engines for gaming right now. It’s going to take time to create them. At this stage, the goal of RTX and Microsoft DXR is to allow ray tracing to be deployed in certain areas of game engines where rasterization does poorly, and where ray tracing could offer better visual fidelity at substantially less performance cost.
That’s not a new realization. When I wrote about ray tracing in 2012, one point I came across is that there are certain areas where ray tracing can actually be faster than rasterization while providing a higher-quality result. Combining the two techniques in the same engine is tricky, keeping the ray tracing fast enough to work in real time is tricky, and Nvidia and Microsoft both deserve credit for pulling it off — but keep in mind precisely what you’re buying into, here. Despite the implications of the hype train, you aren’t going to be playing a game that looks like a ray-traced version of Star Wars any time soon, because the GPU that would deliver that kind of fidelity and resolution doesn’t exist. Demos are always going to look better than shipping products because demos aren’t concerned with emulating the entire game world — just the pretty visuals.
Look to the RTX’s features to provide a nominal boost to image quality. But don’t expect the moon. And never, ever, buy a GPU for a feature someone has promised you will appear at a later date. Buy a GPU for the features it offers today, in shipping titles, that you can definitely take advantage of.
What Can We Say About RTX Performance?
I’m unwilling to declare the RTX 2080’s performance a settled question because numbers don’t always tell the whole story. When Nvidia overhauled its GPUs from Fermi to Kepler, it moved to a dramatically different architecture, and the ability to predict performance by comparing core counts and bandwidth broke as a result. I haven’t seen any information that Turing is as large a departure from Pascal as Kepler was from Fermi, but it’s always best to err on the side of caution until formal benchmark data is available. If Nvidia fundamentally reworked its GPU cores, it’s possible that the gains could be much larger than simple math suggests.
Nonetheless, simple math suggests the gains here are not particularly strong. The only way the RTX 2080 is going to deliver substantial performance improvements over Pascal, beyond the 1.2x – 1.3x suggested by core counts and bandwidth gains, is if Nvidia has pulled off a huge efficiency gain in terms of how much work can be done per SM. When you combine those modest projected gains with the real-but-less-than-awe-inspiring improvements from the incremental addition of ray tracing into shipping engines and the significant price increases Nvidia has tacked on, there’s good reason to keep your wallet in your pocket and wait and see how this plays out.
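For readers who want to see what this kind of "simple math" looks like, here is a rough sketch. The article does not say which cards or figures the 1.2x to 1.3x range was derived from. As an assumption, the example uses the publicly listed CUDA core counts and memory bandwidth of the GTX 1080 Ti and RTX 2080 Ti, and it deliberately ignores clock speeds and any per-SM efficiency changes.

```python
# Publicly listed specifications, used here as an assumption.
# They are not taken from the article itself.
specs = {
    "GTX 1080 Ti": {"cuda_cores": 3584, "bandwidth_gbs": 484.0},
    "RTX 2080 Ti": {"cuda_cores": 4352, "bandwidth_gbs": 616.0},
}

old, new = specs["GTX 1080 Ti"], specs["RTX 2080 Ti"]

# Naive generational scaling: ratio of new to old for each resource.
core_gain = new["cuda_cores"] / old["cuda_cores"]
bandwidth_gain = new["bandwidth_gbs"] / old["bandwidth_gbs"]

print(f"Core count gain: {core_gain:.2f}x")
print(f"Bandwidth gain:  {bandwidth_gain:.2f}x")
```

Both ratios land between 1.2x and 1.3x. This is only a back-of-the-envelope estimate: it says nothing about architectural changes, which is exactly the caveat raised above.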