Leaks Reveal Nvidia 40-Series With Massive L2 Cache, Almost Double the CUDA Cores

A consortium of Twitter users has been poring over the recently leaked data from the Nvidia hack by the rogue group Lapsus$ and posting their findings online. Thus far, the leaks confirm previous rumors that Nvidia’s next-gen offering will indeed raise the bar. Not only will Nvidia’s upcoming Ada Lovelace GPU field a much larger L2 cache, but the flagship chip will also reportedly offer almost double the number of CUDA cores found in the current top chip, the GA102.

The largest change for Nvidia is a staggering 16x increase in total L2 cache on the AD102 compared to existing Ampere GPUs, from 6MB to 96MB, according to a summary via Tom’s Hardware. The AD102 chip will supposedly come with 16MB of cache per 64-bit memory controller, and with the expected 384-bit memory bus that equates to 96MB of cache in total. The current GA102 Ampere chip has just 512KB of cache per 32-bit memory controller, so it’s a substantial increase, and one seemingly designed to rival AMD’s Infinity Cache solution from its RDNA2 RX 6800 GPUs. Interestingly, using more cache as opposed to more memory controllers is one way to restrain power consumption, which is ironic given the AD102 die has previously been rumored to consume up to 850W of power.
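
If you want to sanity-check that math, the figures add up as follows. This is a minimal back-of-the-envelope sketch in Python using only the rumored numbers above; the bus width and per-controller cache sizes are the leak’s claims, not confirmed specifications.

    # Back-of-the-envelope check of the leaked L2 cache figures (rumored, not confirmed)
    BUS_WIDTH_BITS = 384                          # expected AD102 bus width, same as GA102

    # Rumored AD102: 16MB of L2 per 64-bit memory controller
    ad102_controllers = BUS_WIDTH_BITS // 64      # 6 controllers
    ad102_l2_mb = ad102_controllers * 16          # 96MB total

    # Current GA102: 512KB of L2 per 32-bit memory controller
    ga102_controllers = BUS_WIDTH_BITS // 32      # 12 controllers
    ga102_l2_mb = ga102_controllers * 512 / 1024  # 6MB total

    print(ad102_l2_mb, ga102_l2_mb, ad102_l2_mb / ga102_l2_mb)  # 96 6.0 16.0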

This could, and we stress could, indicate that Nvidia’s L2 cache consumes significantly more power than AMD’s L3. That would not necessarily be surprising; L1 consumes more power per KB than L2, and L2 consumes more than L3. Alternatively, it could mean Nvidia is targeting aggressive clocks, or that the new GPU simply accepts very high power consumption in order to deliver maximum performance.

Such an increase in effective memory bandwidth is necessary because of the associated increase in CUDA cores, according to Twitter user ftiwvoe via Videocardz. AD102 is also reported to sport 18,432 cores, a 71 percent boost over the 10,752 in GA102, the chip behind the upcoming RTX 3090 Ti. Ampere’s current 936GB/s of memory bandwidth would simply be insufficient to keep that many cores fed, so adding a lot of extra cache is likely a better solution than adding more power-hungry memory controllers. All the “Lovelace” dies will receive a lot more cache too, with the smaller AD103 and AD104 chips packing 64MB and the AD106 getting 48MB. The baby of the bunch, the AD107, will receive just 32MB, which is still more than 5x the amount in the current GA102 flagship.
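
To put some rough numbers on that bandwidth squeeze, here is a quick illustrative calculation in Python; the AD102 core count and the assumption of unchanged 936GB/s bandwidth are rumors used purely for the sake of comparison.

    # Why more cache helps: per-core bandwidth shrinks as the core count grows
    ga102_cores = 10_752   # GA102, as used in the upcoming RTX 3090 Ti
    ad102_cores = 18_432   # rumored AD102 core count
    bandwidth_gbs = 936    # current Ampere memory bandwidth cited above

    boost = (ad102_cores / ga102_cores - 1) * 100
    print(f"Core-count increase: {boost:.0f}%")        # ~71%

    # If raw bandwidth stayed the same, each core would see roughly 42% less of it,
    # which is the gap the big L2 cache is presumably meant to cover.
    print(f"GB/s per core, GA102: {bandwidth_gbs / ga102_cores:.4f}")   # ~0.0871
    print(f"GB/s per core, AD102: {bandwidth_gbs / ad102_cores:.4f}")   # ~0.0508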

As Tom’s Hardware notes, this seems like a very clear case of Nvidia cribbing from AMD’s approach with its RDNA2 GPUs, as it’s choosing to add more cache instead of widening the memory bus. The rumors indicate Nvidia has no intention of changing the bus width of any next-gen configuration, as opposed to going all the way to a 512-bit or even 1024-bit memory bus. There may be good historical reason for this. In the past, both AMD and Nvidia have occasionally fielded GPUs with very wide memory buses, but such cards tend to offer relatively low efficiency. It may simply make more sense to use larger caches instead.

As it stands, the RX 6800-series GPUs still have even more cache than the rumored RTX 40-series flagship, with 128MB of Infinity Cache on both GPUs in that product stack. However, it’s also possible AMD will up that figure for its RDNA3 GPUs, which are rumored to arrive in the second half of 2022, alongside Nvidia’s new cards in September.
