
NVIDIA Ada Lovelace ‘GeForce RTX 40’ Gaming GPU Detailed: Double…

Details regarding the NVIDIA Ada Lovelace gaming GPU, which will power the GeForce RTX 40 series graphics cards, have been revealed. The new information comes from Kopite7kimi and covers the block diagram of the next-gen architecture.

NVIDIA GeForce Ada Lovelace GPU SM Block Diagram Detailed: Bigger & Better Than Ever For Gamers!

The NVIDIA Ada Lovelace GPU architecture is no secret anymore. We have already seen the specific configurations that will power the next-gen AD10* series SKUs for GeForce RTX 40 series graphics cards, and we have also seen leaked specs of the lineup. Now, it's time to talk purely about the next-generation graphics chip itself.

NVIDIA AD102 ‘Ada Lovelace’ Gaming GPU ‘SM’ Block Diagram (Image Credits: Kopite7kimi):

NVIDIA GA102 ‘Ampere’ Gaming GPU ‘SM’ Block Diagram:

Starting with the GPU configuration, Kopite7kimi compares the top AD102 GPU to various other GPUs from the green team. These include the gaming-focused Ampere GA102 and Turing TU102, while the HPC-focused Hopper GH100 and Ampere GA100 are also added to the list. I am only going to compare the AD102 to its gaming predecessors, since the HPC-focused designs are vastly different from the consumer-centric offerings.

The NVIDIA Ada Lovelace AD102 GPU will feature up to 12 GPCs (Graphics Processing Clusters). This is an increase of roughly 70% compared to GA102, which features only 7 GPCs. Each GPC will consist of 6 TPCs with 2 SMs each, which is the same configuration as the existing chip. Each SM (Streaming Multiprocessor) will house four sub-cores, which is also the same as the GA102 GPU. What has changed is the FP32 and INT32 core configuration. Each SM will include 128 FP32 units, but the combined FP32+INT32 count goes up to 192. This is because the FP32 units are not shared with the INT32 units: the 128 FP32 cores are separate from the 64 INT32 cores.

So in total, each sub-core will consist of 32 FP32 plus 16 INT32 units, and each SM will have 128 FP32 units plus 64 INT32 units for a total of 192 units. Since there are 144 SMs on the full die (12 per GPC), we are looking at 18,432 FP32 units and 9,216 INT32 units. Each SM will also include two warp schedulers (32 threads/clk) for 64 warps per SM. This is a 50% increase in cores (FP32+INT32) and a 33% increase in warps/threads per SM versus the GA102 GPU.
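The full-die totals follow from multiplying out the hierarchy. Here is a minimal sketch of that arithmetic, taking the per-level counts from the preliminary spec table (12 GPCs, 6 TPCs per GPC, 2 SMs per TPC, 128 FP32 and 64 INT32 units per SM). The constant and function names are illustrative only, and every input is a leaked, unconfirmed figure.

```python
# Per-level counts as listed in the preliminary (leaked, unconfirmed) spec table.
GPCS_PER_GPU = 12
TPCS_PER_GPC = 6
SMS_PER_TPC = 2
FP32_PER_SM = 128
INT32_PER_SM = 64


def full_die_counts():
    """Multiply the GPC -> TPC -> SM hierarchy out to whole-GPU totals."""
    tpcs = GPCS_PER_GPU * TPCS_PER_GPC
    sms = tpcs * SMS_PER_TPC
    return {
        "tpcs": tpcs,
        "sms": sms,
        "fp32": sms * FP32_PER_SM,
        "int32": sms * INT32_PER_SM,
    }


counts = full_die_counts()
print(counts)
```

The resulting TPC, SM and FP32 totals line up with the TPC, SM and CUDA core rows of the generational comparison table.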

NVIDIA Ada Lovelace GPU Specs ‘Preliminary’:

GPU Name | AD102 | vs GA102 | vs TU102 | vs GA100 | vs GH100
GPC | 12 (Per GPU) | 1.7x | 2x | 1.5x | 1.5x
TPC | 6 (Per GPC) | Same | Same | 0.75x | 0.67x
SM | 2 (Per TPC) | Same | Same | Same | Same
Sub-Core | 4 (Per SM) | Same | Same | Same | Same
FP32 | 128 (Per SM) | Same | 2x | 2x | Same
FP32+INT32 | 192 (Per SM) | 1.5x | 1.5x | 1.5x | Same
Warps | 64 (Per SM) | 1.33x | 2x | Same | Same
Threads | 2048 (Per SM) | 1.33x | 2x | Same | Same
L1 Cache | 192 KB (Per SM) | 1.5x | 2x | Same | 0.75x
L2 Cache | 96 MB (Per GPU) | 16x | 16x | 2.4x | 1.6x
ROPs | 32 (Per GPC) | 2x | 2x | 2x | 2x

Moving over to the cache, this is another segment where NVIDIA has given a big boost over the existing Ampere GPUs. The Ada Lovelace GPUs will pack 192 KB of L1 cache per SM, an increase of 50% over Ampere. That's a total of 27 MB of L1 cache across the 144 SMs of the top AD102 GPU. The L2 cache will be increased to 96 MB, as mentioned in the leaks. This is a 16x increase over the Ampere GPU, which hosts just 6 MB of L2 cache. The cache will be shared across the GPU.
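As a rough check on the cache figures, the sketch below multiplies the per-SM L1 size by the SM count implied by the spec table (12 GPCs x 6 TPCs x 2 SMs) and compares the L2 sizes. The variable names are made up for this example, and the inputs are the leaked numbers quoted in the article, not confirmed specifications.

```python
KB_PER_MB = 1024

# Leaked, unconfirmed figures quoted in the article.
L1_KB_PER_SM = 192
SM_COUNT = 12 * 6 * 2          # GPCs x TPCs per GPC x SMs per TPC
L2_MB_AD102 = 96
L2_MB_GA102 = 6

# Aggregate L1 across the full die, in MB.
l1_total_mb = L1_KB_PER_SM * SM_COUNT / KB_PER_MB

# L2 growth factor over the Ampere flagship die.
l2_ratio = L2_MB_AD102 / L2_MB_GA102

print(f"Total L1: {l1_total_mb} MB, L2 increase: {l2_ratio}x")
```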

Finally, we have the ROPs, which are also increased to 32 per GPC, an increase of 2x over Ampere. You are looking at up to 384 ROPs on the next-gen flagship versus just 112 on the fastest Ampere GPU, the RTX 3090 Ti. There are also going to be the latest 4th Generation Tensor and 3rd Generation RT (ray tracing) cores on the Ada Lovelace GPUs, which will help boost DLSS and ray tracing performance to the next level. Overall, the Ada Lovelace AD102 GPU will offer:

  • 1.7x GPCs (Versus Ampere)
  • 50% More Cores (Versus Ampere)
  • 50% More L1 Cache (Versus Ampere)
  • 16x More L2 Cache (Versus Ampere)
  • Double The ROPs (Versus Ampere)
  • 4th Gen Tensor & 3rd Gen RT Cores
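The ROP totals quoted above can be reproduced from the per-GPC figures. This is a small sketch under the assumption that GA102 carries 16 ROPs per GPC, which is inferred from the 2x ratio in the spec table and the 112-ROP total rather than stated directly; the dictionary layout is purely illustrative.

```python
# AD102 values are leaked and unconfirmed; the GA102 per-GPC ROP count (16)
# is inferred from the article's "2x" ratio and its 112-ROP total.
ad102 = {"gpcs": 12, "rops_per_gpc": 32}
ga102 = {"gpcs": 7, "rops_per_gpc": 16}


def total_rops(gpu):
    """Whole-die ROP count: GPCs times ROPs per GPC."""
    return gpu["gpcs"] * gpu["rops_per_gpc"]


gpc_ratio = ad102["gpcs"] / ga102["gpcs"]
rops_per_gpc_ratio = ad102["rops_per_gpc"] / ga102["rops_per_gpc"]

print(total_rops(ad102), total_rops(ga102))
print(f"GPC ratio: {gpc_ratio:.2f}x, ROPs per GPC: {rops_per_gpc_ratio:.0f}x")
```

Because both the GPC count and the ROPs per GPC go up, the whole-die ROP count grows by more than the per-GPC doubling.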

Do note that clock speeds, which are said to be in the 2-3 GHz range, are not taken into the equation, so they will also play a major role in improving per-core performance versus Ampere. The NVIDIA GeForce RTX 40 series graphics cards featuring the next-gen Ada Lovelace gaming GPUs are expected to launch in the second half of 2022 and are said to utilize the same TSMC 4N process node as the Hopper H100 GPU.
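To see how much the clock speed matters, here is a hedged sketch of the usual peak-throughput estimate: FP32 cores x 2 operations per clock (one fused multiply-add) x clock speed. The helper name is hypothetical, the 18,432 core count is the rumored full-die figure, and the clocks are simply sampled from the rumored 2-3 GHz range.

```python
def peak_fp32_tflops(fp32_cores, clock_ghz):
    """Theoretical peak: each core retires one FMA (2 FLOPs) per clock."""
    return 2 * fp32_cores * clock_ghz / 1000


AD102_FP32_CORES = 18432  # rumored full-die count, unconfirmed

for clock_ghz in (2.0, 2.5, 3.0):
    tflops = peak_fp32_tflops(AD102_FP32_CORES, clock_ghz)
    print(f"{clock_ghz:.1f} GHz -> {tflops:.1f} TFLOPs")
```

Under these assumptions, a clock near the middle of the rumored range gives a figure in the neighborhood of the ~90 TFLOPs estimate listed for AD102.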


GPU | TU102 | GA102 | AD102
Flagship SKU | RTX 2080 Ti | RTX 3090 Ti | RTX 4090?
Architecture | Turing | Ampere | Ada Lovelace
Process | TSMC 12nm NFF | Samsung 8nm | TSMC 4N?
Die Size | 754mm² | 628mm² | ~600mm²
Graphics Processing Clusters (GPC) | 6 | 7 | 12
Texture Processing Clusters (TPC) | 36 | 42 | 72
Streaming Multiprocessors (SM) | 72 | 84 | 144
CUDA Cores | 4608 | 10752 | 18432
L2 Cache | 6 MB | 6 MB | 96 MB
Theoretical TFLOPs | 16 TFLOPs | 40 TFLOPs | ~90 TFLOPs?
Memory Capacity | 11 GB (2080 Ti) | 24 GB (3090 Ti) | 24 GB (4090?)
Memory Speed | 14 Gbps | 21 Gbps | 24 Gbps?
Memory Bandwidth | 616 GB/s | 1008 GB/s | 1152 GB/s?
Memory Bus | 384-bit | 384-bit | 384-bit
PCIe Interface | PCIe Gen 3.0 | PCIe Gen 4.0 | PCIe Gen 4.0
TGP | 250W | 350W | 600W?
Launch | Sep. 2018 | Sep. 2020 | 2H 2022 (TBC)
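The memory bandwidth entries for GA102 and AD102 follow from the bus width and per-pin memory speed listed in the same table. A minimal sketch of that conversion is shown below; the function name is illustrative, and the AD102 inputs are rumored values.

```python
def memory_bandwidth_gbs(bus_width_bits, speed_gbps):
    """Bandwidth in GB/s: pins x per-pin rate (Gbit/s), divided by 8 bits per byte."""
    return bus_width_bits * speed_gbps / 8


ga102_bw = memory_bandwidth_gbs(384, 21)   # RTX 3090 Ti figures from the table
ad102_bw = memory_bandwidth_gbs(384, 24)   # rumored AD102 figures

print(f"GA102: {ga102_bw:.0f} GB/s, AD102: {ad102_bw:.0f} GB/s")
```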

The post NVIDIA Ada Lovelace ‘GeForce RTX 40’ Gaming GPU Detailed: Double The ROPs, Huge L2 Cache & 50% More FP32 Units Than Ampere, 4th Gen Tensor & 3rd Gen RT Cores by Hassan Mujtaba appeared first on Wccftech.