It looks like AMD’s next-gen Instinct MI300 GPU accelerator has made a possible first appearance in the latest Linux patch.
AMD Instinct MI300 ‘GFX940’ GPU, Next-Gen Data Center MCM Accelerator, Makes Possible First Appearance In Linux Patch
The latest Linux patch includes a new target for an unreleased AMD ‘GFX940’ GPU which has a similar ISA to the Aldebaran ‘GFX90a’ GPU. It is speculated that this chip could be powering AMD’s next-generation Instinct MI300 GPU accelerator, and it supports all the data-centric features such as MFMA (Matrix-Fused-Multiply-Add), full-rate FP64, and packed FP32 operations. Other features include XNACK, which is specific to CPU+GPU memory space integration, as Coelacanth-Dream puts it.
The source states that while the GPU ISA is similar, the GFX940 does have a few differences when compared to Aldebaran ‘CDNA 2’ GPUs, which are listed below:
Previous rumors have indicated that the AMD Instinct MI300 will feature a 4-GCD design based on the brand new CDNA 3 architecture. The upcoming Instinct MI200 was going to feature 128 compute units per die, but that has changed to 110 compute units since last week’s rumor. A total of 220 Compute Units across two dies would net 14,080 cores, and if we take the same 110-per-die figure and multiply it by 4 (the number of GCDs on Instinct MI300), we end up with 440 Compute Units or an insane 28,160 cores.
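The arithmetic behind those rumored figures can be sketched in a few lines of Python. This is only an illustration: the function name `total_cores` is hypothetical, and the 64 cores per compute unit is an assumption taken from the ratio implied by the 220 CU and 14,080 core figures above, not a confirmed MI300 specification.

```python
# Assumption: 64 cores (stream processors) per compute unit, the ratio
# implied by the article's "220 Compute Units would net 14,080 cores".
CORES_PER_CU = 64


def total_cores(cus_per_die: int, dies: int, cores_per_cu: int = CORES_PER_CU):
    """Return (total compute units, total cores) for a multi-die GPU."""
    total_cus = cus_per_die * dies
    return total_cus, total_cus * cores_per_cu


# Rumored 110 compute units per die, on two dies and on four dies.
two_die = total_cores(cus_per_die=110, dies=2)
four_die = total_cores(cus_per_die=110, dies=4)

print(two_die)   # (220, 14080)
print(four_die)  # (440, 28160)
```

The four-die result only holds if the per-die compute unit count and the cores-per-CU ratio carry over unchanged from the two-die rumor.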
— Kepler (@Kepler_L2) March 1, 2022
MI300 will feature four GCDs
— Kepler (@Kepler_L2) September 7, 2021
A recent AMD ROCm Developer Tools update that was spotted by Komachi did confirm a maximum of four MCM GPUs, but those are merely ‘Aldebaran’ SKUs. There are expected to be at least four CDNA 2 powered Instinct accelerators, with their respective unique IDs listed below. Note that the number doesn’t represent the number of dies on each product but rather the device itself:
Now, that core count would only be true if AMD makes no changes whatsoever when moving from CDNA 2 to CDNA 3, but that’s not the case. CDNA 3 is expected to bring forward a revised new architecture that won’t be another Vega derivative like Arcturus or Aldebaran, which makes this rumor more plausible.
The GPU architecture might also use a design that ends up looking similar to the new WGP/SE arrangement on the new RDNA 3 chips, or an entirely new design tailored towards the HPC segment. But one thing is for sure, those quad-MCM GPUs definitely are something that we can’t wait to see in action!
AMD Radeon Instinct Accelerators 2020
|Accelerator Name||AMD Instinct MI300||AMD Instinct MI250X||AMD Instinct MI250||AMD Instinct MI210||AMD Instinct MI100||AMD Radeon Instinct MI60||AMD Radeon Instinct MI50||AMD Radeon Instinct MI25||AMD Radeon Instinct MI8||AMD Radeon Instinct MI6|
|GPU Architecture||TBA (CDNA 3)||Aldebaran (CDNA 2)||Aldebaran (CDNA 2)||Aldebaran (CDNA 2)||Arcturus (CDNA 1)||Vega 20||Vega 20||Vega 10||Fiji XT||Polaris 10|
|GPU Process Node||Advanced Process Node||6nm||6nm||6nm||7nm FinFET||7nm FinFET||7nm FinFET||14nm FinFET||28nm||14nm FinFET|
|GPU Dies||4 (MCM)?||2 (MCM)||2 (MCM)||1 (MCM)||1 (Monolithic)||1 (Monolithic)||1 (Monolithic)||1 (Monolithic)||1 (Monolithic)||1 (Monolithic)|
|GPU Clock Speed||TBA||1700 MHz||1700 MHz||~1700 MHz?||~1500 MHz||1800 MHz||1725 MHz||1500 MHz||1000 MHz||1237 MHz|
|FP16 Compute||TBA||383 TOPs||362 TOPs||~176 TOPs||185 TFLOPs||29.5 TFLOPs||26.5 TFLOPs||24.6 TFLOPs||8.2 TFLOPs||5.7 TFLOPs|
|FP32 Compute||TBA||95.7 TFLOPs||90.5 TFLOPs||~44 TFLOPs||23.1 TFLOPs||14.7 TFLOPs||13.3 TFLOPs||12.3 TFLOPs||8.2 TFLOPs||5.7 TFLOPs|
|FP64 Compute||TBA||47.9 TFLOPs||45.3 TFLOPs||~22 TFLOPs||11.5 TFLOPs||7.4 TFLOPs||6.6 TFLOPs||768 GFLOPs||512 GFLOPs||384 GFLOPs|
|VRAM||TBA||128 GB HBM2e||128 GB HBM2e||64 GB HBM2e||32 GB HBM2||32 GB HBM2||16 GB HBM2||16 GB HBM2||4 GB HBM1||16 GB GDDR5|
|Memory Clock||TBA||3.2 Gbps||3.2 Gbps||3.2 Gbps?||1200 MHz||1000 MHz||1000 MHz||945 MHz||500 MHz||1750 MHz|
|Memory Bus||TBA||8192-bit||8192-bit||4096-bit||4096-bit bus||4096-bit bus||4096-bit bus||2048-bit bus||4096-bit bus||256-bit bus|
|Memory Bandwidth||TBA||3.2 TB/s||3.2 TB/s||1.6 TB/s||1.23 TB/s||1 TB/s||1 TB/s||484 GB/s||512 GB/s||224 GB/s|
|Form Factor||TBA||OAM||OAM||Dual Slot Card||Dual Slot, Full Length||Dual Slot, Full Length||Dual Slot, Full Length||Dual Slot, Full Length||Dual Slot, Half Length||Single Slot, Full Length|
|Cooling||TBA||Passive Cooling||Passive Cooling||Passive Cooling||Passive Cooling||Passive Cooling||Passive Cooling||Passive Cooling||Passive Cooling||Passive Cooling|