RTX 2080 vs. Radeon VII vs. 5700 XT: Rendering and Compute Performance

Most of our GPU coverage focuses on the consumer side of the business and on game benchmarking, but I promised to examine the compute side of performance back when the Radeon VII launched. With the 5700 XT having debuted recently, we had an opportunity to return to this question with a new GPU architecture from AMD and compare RDNA against GCN.

In fact, the overall compute situation is at an interesting crossroads. AMD has declared that it wishes to be a more serious player in enterprise compute environments but has also said that GCN will continue to exist alongside RDNA in this space. The Radeon VII is a consumer variant of AMD’s MI50 accelerator, with FP64 running at half the MI50’s rate. If you know you need double-precision (FP64) compute, the Radeon VII fills that niche in a way that no other GPU in this comparison does.

The Radeon VII has the highest RAM bandwidth and it’s the only GPU in this comparison to offer much in the way of double-precision performance. But while these GPUs have relatively similar on-paper specs, there’s significant variance between them in terms of performance — and the numbers don’t always break the way you think they would.

Regarding Blender 2.80

Our test results contain data from both Blender 2.80 and the standalone Blender benchmark, 1.0beta2 (released August 2018). Blender 2.80 is a major release for the application, and it contains a number of significant changes. The standalone benchmark is not compatible with Nvidia’s RTX family, which necessitated testing with the latest version of the software. Initially, we tested the Blender 2.80 beta, but then the final version dropped — so we dumped the beta results and retested.

There are significant performance differences between the Blender 1.0beta2 benchmark and 2.80, and one scene, Classroom, does not render properly in the new version. This scene has been dropped from our 2.80 comparisons. Blender allows the user to specify a tile size in pixels to control how much of the scene is worked on at once. Code in the Blender 1.0beta2 benchmark’s Python files indicates that the test uses a tile size of 512×512 (X/Y coordinates) for GPUs and 16×16 for CPUs. Most of the scene files contained within the benchmark, however, use a tile size of 32×32 by default if loaded within Blender 2.80.

We tested Blender 2.80 in two different modes. First, we tested all compatible scenes using the default tile size those scenes loaded with. This was 16×16 for Barbershop_Interior, and 32×32 for all other scenes. Next, we tested the same renders with the tile size set to 512×512. Up until now, the rule with tile sizes has been that larger sizes were good for GPUs, while smaller sizes were good for CPUs. This appears to have changed somewhat with Blender 2.80. AMD and Nvidia GPUs show very different responses to larger tile sizes, with AMD GPUs speeding up at larger tile sizes and Nvidia GPUs losing performance.
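To give a sense of scale for these tile sizes, the sketch below counts how many tiles a single frame is split into. The 1920×1080 resolution is an assumption chosen for illustration; the benchmark scenes may render at other resolutions, and the function name is hypothetical.

```python
import math

def tile_count(width, height, tile_x, tile_y):
    """Number of tiles needed to cover a width x height frame."""
    # Partial tiles along the right and bottom edges still count as tiles.
    return math.ceil(width / tile_x) * math.ceil(height / tile_y)

# Hypothetical 1920x1080 frame (not taken from the benchmark scenes).
for size in (16, 32, 512):
    print(f"{size}x{size}: {tile_count(1920, 1080, size, size)} tiles")
```

Under that assumption, a 16×16 tile size breaks the frame into thousands of small work units, while 512×512 yields about a dozen large ones.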

Because the scene files we are testing were created in an older version of Blender, it’s possible that this might be impacting our overall results. We have worked extensively with AMD for several weeks to explore aspects of Blender performance on GCN GPUs. GCN, Pascal, Turing, and RDNA all show a different pattern of results when moving from 32×32 to 512×512, with Turing losing less performance than Pascal and RDNA gaining more performance in most circumstances than GCN.

All of our GPUs benefited substantially from not using a 16×16 tile size for Barbershop_Interior. While this test defaults to 16×16, it does not render very well at that tile size on any GPU.

Troubleshooting the different results we saw in the Blender 1.0beta2 benchmark versus the Blender 2.80 beta and finally Blender 2.80 final has held up this review for several weeks, and we’ve swapped through several AMD drivers while working on it. As a result, all of our Blender 2.80 results were run using Adrenalin 2019 Edition 19.8.1.

Test Setup and Notes

All GPUs were tested on an Intel Core i7-8086K system using an Asus Prime Z370-A motherboard. The Vega 64, Radeon RX 5700 XT, and Radeon VII were all tested using Adrenalin 2019 Edition 19.7.2 (7/16/2019) for everything but Blender 2.80. All Blender 2.80 tests were run using 19.8.1, not 19.7.2. The Nvidia GeForce GTX 1080 and Gigabyte Aorus RTX 2080 were both tested using Nvidia’s 431.60 Game Ready Driver (7/23/2019).

CompuBench 2.0 runs GPUs through a series of tests intended to measure various aspects of their compute performance. Kishonti, developers of CompuBench, don’t appear to offer any significant breakdown on how they’ve designed their tests, however. Level set simulation may refer to using level sets for the analysis of surfaces and shapes. Catmull-Clark Subdivision is a technique used to create smooth surfaces. N-body simulations are simulations of dynamic particle systems under the influence of forces like gravity. TV-L1 optical flow is an implementation of an optical flow estimation method, used in computer vision.

SPEC Workstation 3.1 contains many of the same workloads as SPECViewPerf, but also has additional GPU compute workloads, which we’ll break out separately. A complete breakdown of the workstation test and its application suite can be found here. SPEC Workstation 3.1 was run in its 4K native test mode. While this test run was not submitted to SPEC for formal publication, our testing of SPEC Workstation 3.1 obeyed the organization’s stated rules for testing, which can be found here.

We’ve cooked up two sets of results for you — a synthetic series of benchmarks, created with SiSoft Sandra and investigating various aspects of how these chips compare, including processing power, memory latency, and internal characteristics, and a wider suite of tests that touch on compute and rendering performance in various applications. Since the SiSoft Sandra 2020 tests are all unique to that application, we’ve opted to break them out into their own slideshow.

The Gigabyte Aorus RTX 2080 results should be read as approximately equivalent to an RTX 2070S. The two GPUs perform nearly identically in consumer workloads and should match each other in workstation workloads as well.

SiSoft Sandra 2020

SiSoft Sandra is a general-purpose system information utility and full-featured performance evaluation suite. While it’s a synthetic test, it’s probably the most full-featured synthetic evaluation utility available, and Adrian Silasi, its developer, has spent decades refining and improving it, adding new features and tests as CPUs and GPUs evolve.

Our SiSoft Sandra-specific results are below. Some of our OpenCL results are a little odd where the 5700 XT is concerned, but according to Adrian, he’s not yet had the chance to optimize code for execution on the 5700 XT. Consider these results to be preliminary — interesting, but perhaps not yet indicative — as far as that GPU is concerned.

Sandra's GPGPU compute test measures performance in multiple metrics; we've chosen to focus on half-precision, single-precision, and double-precision floating-point performance. The Radeon VII offers far stronger support for double-precision floating point, reflected in its far higher test scores. The RTX 2080 wins the half-precision test, ties in single-precision, and does not distinguish itself in double-precision.
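For readers less familiar with these formats, here is a minimal sketch of why precision matters. It is not how Sandra measures anything; it simply stores the same value in IEEE 754 half, single, and double precision using Python's standard struct module and reports the rounding error. The helper name is hypothetical.

```python
import struct

def round_trip(fmt, value):
    """Store value in the given IEEE 754 format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

value = 0.1
# "e" = 16-bit half, "f" = 32-bit single, "d" = 64-bit double.
for name, fmt in (("half", "<e"), ("single", "<f"), ("double", "<d")):
    stored = round_trip(fmt, value)
    # Python floats are doubles, so the double round trip shows zero error here.
    print(f"{name:6s} stored={stored!r} error={abs(stored - value):.3e}")
```

The half-precision error is several orders of magnitude larger than the single-precision error, which is why workloads that accumulate many operations often need FP64 despite its lower throughput.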

The RX 5700 XT failed Sandra's cryptography test when run with OpenCL. The Radeon VII's enormous memory bandwidth gives it an edge over the RTX 2080 in all test modes, while Vega 64 offers competitive encryption/decryption performance but falls behind the RTX 2080 in the hashing workload.

We broke the Black-Scholes test out from the other financial model evaluations because the performance differences were too large to graph properly. The low-precision OpenCL test vastly favored the Radeon VII, while the 5700 XT and RTX 2080 were evenly matched. Vega 64's high precision performance is significantly better than the 5700 XT's, but again, optimizations could play a major role here. The less said about the GTX 1080, the better.
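As a reminder of what a Black-Scholes workload computes, here is a minimal sketch of the textbook closed-form price of a European call option. Sandra's actual implementation is not documented in this article, so treat this as an illustration only; the function names and input values are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(spot, strike, rate, vol, years):
    """Textbook Black-Scholes price of a European call option."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / (
        vol * math.sqrt(years)
    )
    d2 = d1 - vol * math.sqrt(years)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)

# Illustrative inputs only: spot 100, strike 100, 5% rate, 20% volatility, 1 year.
print(round(black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0), 4))
```

A benchmark of this kind typically evaluates the formula for a very large batch of options, which makes it a natural fit for GPU parallelism and sensitive to floating-point throughput.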

For scientific analysis, we focused on general matrix multiplication. Performance figures here generally match the results we've cataloged in the previous tests.
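For context, general matrix multiplication (GEMM) computes C = alpha * A * B + beta * C. The sketch below is a deliberately naive pure-Python version meant only to show the operation; GPU libraries use heavily tiled and vectorized kernels, and the function name here is hypothetical.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for matrices given as lists of rows."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [
            # Dot product of row i of a with column j of b, then scale and add c.
            alpha * sum(a[i][k] * b[k][j] for k in range(inner)) + beta * c[i][j]
            for j in range(cols)
        ]
        for i in range(rows)
    ]

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 1], [1, 1]]
print(gemm(2.0, a, b, 1.0, c))
```

Each output element needs a full dot product, so the work grows with the cube of the matrix size, which is why GEMM is a common stress test for raw floating-point throughput.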

Sandra's image processing test shows the Radeon VII leading all other cards, though Vega 64 performs well here. Low performance from the 5700 XT may be related to OCL optimizations.

Just as with CPUs, the maximum useable memory bandwidth is always lower than the maximum theoretical. Sandra's version of this test shows lower figures than we might have expected, but the only real surprise is the 5700 XT out-classing Vega 64.
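The theoretical maximum is simple arithmetic: memory bus width multiplied by the per-pin data rate, divided by eight to convert bits to bytes. The sketch below applies that formula. The bus widths and data rates are commonly published specifications for these cards rather than figures from this article, so treat them as assumptions; the function name is hypothetical.

```python
def peak_bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Theoretical peak bandwidth in GB/s: bus width x per-pin rate / 8."""
    return bus_width_bits * data_rate_gbps / 8

# Assumed published specs (not from this article):
# Radeon VII: 4096-bit HBM2 at 2.0 Gbps per pin.
# Radeon RX 5700 XT: 256-bit GDDR6 at 14 Gbps per pin.
print("Radeon VII:", peak_bandwidth_gb_s(4096, 2.0), "GB/s")
print("RX 5700 XT:", peak_bandwidth_gb_s(256, 14.0), "GB/s")
```

Measured bandwidth in a test like Sandra's will land below these ceilings because of access patterns, refresh overhead, and driver behavior.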

We first tested RAM latency with OCL/CUDA, but the scores didn't look right. Different types of cache access patterns have very different latencies, but the 924ns latency for full random on 5700 XT was odd. RAM latencies didn't show any particular relation to performance patterns in Sandra or in other tests.

We retested RAM latency with D3D11 instead of OpenCL/CUDA to see if that would change the final results. GCN and RDNA scores improved dramatically in this API, though the Radeon 5700 XT still scored oddly.

Our SiSoft Sandra 2020 benchmarks point largely in the same direction. If you need double-precision floating-point, the Radeon VII is a compute monster. While it’s not clear how many buyers fall into that category, there are certain places, like image processing and high-precision workloads, where the Radeon VII shines.

The RDNA-based Radeon 5700 XT does less to distinguish itself in these tests, but we’re also in contact with Silasi concerning the issues we ran into during testing. Improved support may change some of these results in months ahead.

Test Results

Now that we’ve addressed Sandra performance, let’s turn to the rest of our benchmark suite. Our other results are included in the slideshow below:

IndigoBench is a standalone rendering benchmark that's based on Indigo Render, an unbiased, photorealistic GPU and CPU renderer. Performance is in millions of samples per second, with the Core i7-8086K's performance provided for reference. In the first scene, Bedroom, the Radeon RX 5700 XT beats Radeon VII and Vega 64, falling only to the RTX 2080. The gap between Nvidia and everyone else is significantly larger in Supercar, where the Gigabyte Aorus leads Radeon VII by 1.59x. The 5700 XT is slightly faster than the Radeon VII here as well.

Not too many surprises or upsets here. The Radeon VII wins both of these sub-tests with ease.

CompuBench favors the Radeon VII overall, but there are specific tests where the RTX 2080 takes victory, like Catmull-Clark subdivision. The 5700 XT may require specific optimizations for its architecture; it generally matches the Vega 64 but isn't nearly as fast in the TV-L1 test and couldn't run n-body tests at all. The Radeon VII is between 1.14x and 1.58x faster than the Vega 64, depending on the test.
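The "x faster" figures quoted in these results are simple ratios, but the direction of the division depends on whether higher or lower numbers are better. A minimal sketch, with hypothetical function names and made-up inputs rather than our measured scores:

```python
def speedup_from_scores(score_a, score_b):
    """How many times faster A is than B when higher scores are better."""
    return score_a / score_b

def speedup_from_times(time_a, time_b):
    """How many times faster A is than B when lower times are better."""
    return time_b / time_a

# Made-up numbers for illustration only.
print(speedup_from_scores(15.8, 10.0))   # throughput-style score
print(speedup_from_times(50.0, 100.0))   # render time in seconds
```

Throughput benchmarks such as IndigoBench (samples per second) use the first form, while render-time tests such as Blender use the second.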

The first three rendering tests — Catia, Creo, and Energy — are decided wins for the Radeon RX 5700 XT, which beats the Radeon VII in all three tests and handily outpaces the RTX 2080 as well. The gap between Vega and Radeon VII is much smaller than the gap between the 5700 XT and Radeon VII.

Radeon VII's regression in SNX-03 is unusual, but so is RDNA's performance. Even the RTX 2080 is left in the dust by AMD's latest GPU architecture. Professional GPU applications appear to love this graphics card. SNX-03 is a particular blowout for AMD's RDNA.

SPEC Workstation's final graphics test is less a blowout and more a general-purpose beating. The Radeon 5700 XT only wins one of three tests and by smaller margins. The Radeon VII takes home the showcase-02 benchmark, while the RTX 2080 triumphs in the 3dsmax-06 application test. Overall, the 5700 XT makes an amazingly strong argument for itself in professional graphics applications, winning far more tests than it loses, especially for a $400 GPU going up against cards in the $500 - $700 range.

Finally, we've got SPEC's GPU compute applications: Folding@home (FAH), LuxRender, and Caffe. FAH wouldn't run on the 5700 XT, so we don't have results for it. The Radeon VII wins LuxRender, the 5700 XT wins SPEC's Caffe benchmark, and the RTX 2080 takes home a win in Folding@home.

LuxMark includes three scenes, at varying complexities. The Radeon VII dominates all three benchmarks, though the RTX 2080 puts up a much better fight than the GTX 1080. The Radeon RX 5700 XT continues to struggle with OpenCL, a pattern that also shows up in our SiSoft Sandra results. Performance ranges from matching Vega 64 to behind even the GTX 1080, which undoubtedly appreciates being allowed to win something.

We used the Blender 1.0beta2 benchmark for our first round of Blender testing. The Gigabyte Aorus RTX 2080 is omitted from these results because the benchmark doesn't run on it. The 5700 XT is faster than Vega 64 in 5 out of 6 tests and beats the Radeon VII in two. The GTX 1080 is flatly uncompetitive in this scenario.

Our first suite of Blender 2.80 tests uses the default tile sizes that these scenes are programmed to use: 16×16 in Barbershop, 32×32 in all other cases. Render times are vastly improved on the GTX 1080 compared with the standalone test, but GCN GPUs take a massive hit in Barbershop_Interior and are negatively impacted in two other tests. The RTX 2080 has a strong leadership position in this test at low tile size.

Increasing tile size to 512×512 dramatically improves GCN and RDNA results. While the 5700 XT doesn't get as much improvement out of Barbershop_Interior as the Vega 64 and Radeon VII, it shows the most consistent improvement across all tests. Nvidia GPUs, in contrast, get worse in every scene but Barbershop_Interior. Barbershop_Interior's 16×16 default is simply too low. RDNA wins two tests (Barbershop, Koro), GCN wins one (Pavilion_Barcelona), and Turing takes BMW27 and Fishy_Cat.

Conclusions

What do these results tell us? A lot of rather interesting things. First of all, RDNA is downright impressive. Keep in mind that we’ve tested this GPU in professional and compute-oriented applications, none of which have been updated or patched to run on it. There are clear signs that this has impacted our benchmark results, including some tests that either wouldn’t run or ran slowly. Even so, the 5700 XT impresses.

Radeon VII impresses too, but in different ways than the 5700 XT. SiSoft Sandra 2020 shows the advantage this card can bring to double-precision workloads, where it offers far more performance than anything else on the market. AI and machine learning have become much more important of late, but if you’re working in an area where GPU double-precision is key, Radeon VII packs an awful lot of firepower. SiSoft Sandra does include tests that rely on D3D11 rather than OpenCL. But given that OpenCL is the chief competitor to CUDA, I opted to stick with it in all cases save for the memory latency tests, which globally showed lower latencies for all GPUs when D3D was used compared with OpenCL.

AMD has previously said that it intends to keep GCN in-market for compute, with Navi oriented towards the consumer market, but there’s no indication that the firm intends to continue evolving GCN on a separate trajectory from RDNA. The more likely meaning of this is that GCN won’t be replaced at the top of the compute market until Big Navi is ready at some point in 2020. Based on what we’ve seen, there’s a lot to be excited about on that front. There are already applications where RDNA is significantly faster than Radeon VII, despite the vast difference between the cards in terms of double-precision capability, RAM bandwidth, and memory capacity.

Blender 2.80 presents an interesting series of comparisons between RDNA, GCN, and CUDA. Using higher tile sizes has an enormous impact on GPU performance, but whether that difference is good or bad depends on which brand of GPU you use and which architectural family it belongs to. Pascal and Turing GPUs performed better with smaller tile sizes, while GCN GPUs performed better with larger ones. The 512×512 tile size was better in total for all GPUs, but only because it improved the total rendering time on Barbershop_Interior by more than it harmed the render time of every other scene for Turing and Pascal GPUs. The RTX 2080 was the fastest GPU in our Blender benchmarks, but the 5700 XT put up excellent performance results overall.
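The point about totals can be counterintuitive, so here is a minimal sketch of how a larger tile size can slow down most scenes yet still lower the total render time when one scene dominates. The render times below are hypothetical and are NOT measurements from this review; the function name is also an assumption.

```python
def compare_tile_sizes(small_tile_times, large_tile_times):
    """Return (scenes slower at the large tile size, total small, total large)."""
    slower = [
        scene
        for scene in small_tile_times
        if large_tile_times[scene] > small_tile_times[scene]
    ]
    return slower, sum(small_tile_times.values()), sum(large_tile_times.values())

# Hypothetical render times in seconds, chosen only to show the effect.
small = {"Barbershop_Interior": 1000, "BMW27": 100, "Fishy_Cat": 150, "Koro": 200}
large = {"Barbershop_Interior": 600, "BMW27": 120, "Fishy_Cat": 180, "Koro": 240}

slower, total_small, total_large = compare_tile_sizes(small, large)
print("Slower at large tiles:", slower)
print("Totals:", total_small, "vs", total_large)
```

In this made-up case three of four scenes get slower, but the total still drops because the Barbershop_Interior improvement outweighs the combined regressions. That is why a summed render time should be read alongside the per-scene results.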

I do not want to make global pronouncements about Blender 2.80 settings; I am not a 3D rendering expert. These test results suggest that Blender performs better with larger tile settings on AMD GPUs but that smaller tile settings may produce better results for Nvidia GPUs. In the past, both AMD and Nvidia GPUs have benefited from larger tile sizes. This pattern could also be linked to the specific scenes in question, however. If you run Blender, I suggest experimenting with different scenes and tile sizes.

Ultimately, what these results suggest is that there’s more variation in GPU performance in some of these professional markets than we might expect for gaming. There are specific tests where the 5700 XT is markedly faster than the RTX 2080 or Radeon VII and other tests where it falls sharply behind them. OpenCL driver immaturity may account for some of this, but we see flashes of brilliance in these performance figures. The Radeon VII’s double-precision performance puts it in a class of its own in certain respects, but the Radeon RX 5700 XT is a far less expensive and quieter card. Depending on what your target application is, AMD’s new $400 GPU might be the best choice on the market. In other scenarios, both the Radeon VII and the RTX 2080 can make specific claims to being the fastest card available.

Feature image is the final render of the Benchmark_Pavilion scene included in the Blender 1.0beta2 standalone benchmark.
