AMD has officially announced its next-generation MI200 HPC GPU, codenamed Aldebaran, which utilizes the 6nm CDNA 2 architecture to deliver insane compute capabilities.
AMD Unveils Instinct MI200, Powering The Next-Gen Compute Powerhouse With First 6nm MCM GPU Technology & Over 95 TFLOPs FP32 Performance
AMD is officially first to MCM GPU technology, and it is doing so with a grand product: the Instinct MI200, codenamed Aldebaran. The AMD Aldebaran GPU will come in various shapes & sizes, but it's all based on the brand new CDNA 2 architecture, the most refined iteration of Vega yet. Some of the main features, before we go into depth, are listed below:
- AMD CDNA 2 architecture – 2nd Gen Matrix Cores accelerating FP64 and FP32 matrix operations, delivering up to 4X the peak theoretical FP64 performance vs. AMD previous-gen GPUs.
- Leadership Packaging Technology – Industry-first multi-die GPU design with 2.5D Elevated Fanout Bridge (EFB) technology delivers 1.8X more cores and 2.7X higher memory bandwidth vs. AMD previous-gen GPUs, offering the industry's best aggregate peak theoretical memory bandwidth at 3.2 terabytes per second.
- 3rd Gen AMD Infinity Fabric technology – Up to 8 Infinity Fabric links connect the AMD Instinct MI200 with 3rd Gen EPYC CPUs and other GPUs in the node to enable unified CPU/GPU memory coherency and maximize system throughput, allowing for an easier on-ramp for CPU codes to tap the power of accelerators.
Within the AMD Instinct MI200 is an Aldebaran GPU featuring two dies, a primary and a secondary. Each of the two dies consists of 8 shader engines for a total of 16 SEs. Each Shader Engine packs 16 CUs with full-rate FP64, packed FP32 & a 2nd Generation Matrix Engine for FP16 & BF16 operations.
Each die, as such, is composed of 128 compute units or 8,192 stream processors. Of the full chip's 256 compute units, 220 compute units or 14,080 stream processors are enabled on the flagship configuration. The Aldebaran GPU is also powered by a new XGMI interconnect. Each chiplet features a VCN 2.6 engine and the main IO controller.
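The shader math above can be sanity-checked with a quick sketch. This assumes the standard GCN/CDNA convention of 64 stream processors per compute unit, which is not stated explicitly in the article:

```python
# Back-of-the-envelope check of the Aldebaran shader configuration.
# 64 stream processors (SPs) per CU is the usual GCN/CDNA ratio (assumption).
SP_PER_CU = 64

cus_per_die = 8 * 16                  # 8 shader engines x 16 CUs each
sps_per_die = cus_per_die * SP_PER_CU
print(cus_per_die, sps_per_die)       # 128 CUs, 8192 SPs per die

# Two dies give 256 CUs physically; the flagship ships with 220 enabled.
enabled_cus = 220
print(enabled_cus * SP_PER_CU)        # 14080 stream processors
```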
Built on the AMD CDNA 2 architecture, AMD Instinct MI200 series accelerators deliver leading application performance for a broad set of HPC workloads. The AMD Instinct MI250X accelerator delivers up to 4.9X better performance than competitive accelerators for double precision (FP64) HPC applications and surpasses 380 teraflops of peak theoretical half-precision (FP16) for AI workloads, enabling disruptive approaches in further accelerating data-driven research.
In terms of performance, AMD is touting various record wins in the HPC segment over NVIDIA's A100 solution, with up to 3X performance improvements in AMG.
As for DRAM, AMD has gone with an 8-channel design consisting of 1024-bit interfaces for an 8192-bit wide bus. Each interface can support 2GB HBM2e DRAM modules. This should give us up to 16 GB of HBM2e memory capacity per stack, and since there are 8 stacks in total, the total capacity comes to a whopping 128 GB. That's 48 GB more than the A100, which houses 80 GB of HBM2e memory. The memory will clock in at an insane speed of 3.2 Gbps for a total bandwidth of 3.2 TB/s. This is a full 1.2 TB/s more bandwidth than the A100 80 GB, which delivers 2 TB/s.
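The capacity and bandwidth figures follow directly from the stack layout described above; a minimal sketch of the arithmetic:

```python
# Peak HBM2e bandwidth = bus width (bits) x data rate (Gbps) / 8 bits per byte.
bus_width_bits = 8 * 1024          # eight 1024-bit HBM2e interfaces
data_rate_gbps = 3.2
bandwidth_gbs = bus_width_bits * data_rate_gbps / 8
print(bandwidth_gbs)               # 3276.8 GB/s, marketed as 3.2 TB/s

# Capacity: 8 stacks x 16 GB per stack.
print(8 * 16)                      # 128 GB
```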
The AMD Instinct MI200 will be powering three top-tier supercomputers: the United States' exascale Frontier system, the European Union's pre-exascale LUMI system, and Australia's petascale Setonix system. The competition includes the A100 80 GB, which offers 19.5 TFLOPs of FP64 (Tensor), 156 TFLOPs of TF32 and 312 TFLOPs of FP16 compute power. But we are going to hear about NVIDIA's own Hopper MCM GPU next year, so there's going to be heated competition between the two GPU juggernauts in 2022.
AMD Radeon Instinct Accelerators 2020
| Accelerator Name | AMD Instinct MI300 | AMD Instinct MI250X | AMD Instinct MI250 | AMD Instinct MI210 | AMD Instinct MI100 | AMD Radeon Instinct MI60 | AMD Radeon Instinct MI50 | AMD Radeon Instinct MI25 | AMD Radeon Instinct MI8 | AMD Radeon Instinct MI6 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPU Architecture | TBA (CDNA 3) | Aldebaran (CDNA 2) | Aldebaran (CDNA 2) | Aldebaran (CDNA 2) | Arcturus (CDNA 1) | Vega 20 | Vega 20 | Vega 10 | Fiji XT | Polaris 10 |
| GPU Process Node | Advanced Process Node | 6nm | 6nm | 6nm | 7nm FinFET | 7nm FinFET | 7nm FinFET | 14nm FinFET | 28nm | 14nm FinFET |
| GPU Dies | 4 (MCM)? | 2 (MCM) | 2 (MCM) | 2 (MCM) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) |
| GPU Cores | 28,160? | 14,080 | 13,312 | TBA | 7,680 | 4,096 | 3,840 | 4,096 | 4,096 | 2,304 |
| GPU Clock Speed | TBA | 1700 MHz | 1700 MHz | TBA | ~1500 MHz | 1800 MHz | 1725 MHz | 1500 MHz | 1000 MHz | 1237 MHz |
| FP16 Compute | TBA | 383 TOPs | 362 TOPs | TBA | 185 TFLOPs | 29.5 TFLOPs | 26.5 TFLOPs | 24.6 TFLOPs | 8.2 TFLOPs | 5.7 TFLOPs |
| FP32 Compute | TBA | 95.7 TFLOPs | 90.5 TFLOPs | TBA | 23.1 TFLOPs | 14.7 TFLOPs | 13.3 TFLOPs | 12.3 TFLOPs | 8.2 TFLOPs | 5.7 TFLOPs |
| FP64 Compute | TBA | 47.9 TFLOPs | 45.3 TFLOPs | TBA | 11.5 TFLOPs | 7.4 TFLOPs | 6.6 TFLOPs | 768 GFLOPs | 512 GFLOPs | 384 GFLOPs |
| VRAM | TBA | 128 GB HBM2e | 128 GB HBM2e | TBA | 32 GB HBM2 | 32 GB HBM2 | 16 GB HBM2 | 16 GB HBM2 | 4 GB HBM1 | 16 GB GDDR5 |
| Memory Clock | TBA | 3.2 Gbps | 3.2 Gbps | TBA | 1200 MHz | 1000 MHz | 1000 MHz | 945 MHz | 500 MHz | 1750 MHz |
| Memory Bus | TBA | 8192-bit | 8192-bit | 8192-bit | 4096-bit | 4096-bit | 4096-bit | 2048-bit | 4096-bit | 256-bit |
| Memory Bandwidth | TBA | 3.2 TB/s | 3.2 TB/s | TBA | 1.23 TB/s | 1 TB/s | 1 TB/s | 484 GB/s | 512 GB/s | 224 GB/s |
| Form Factor | TBA | OAM | OAM | Dual Slot Card | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Half Length | Single Slot, Full Length |
| Cooling | TBA | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling |
The Aldebaran MI200 GPU will come in three configurations: the OAM-only MI250 and MI250X & the dual-slot PCIe MI210. AMD has only shared detailed specifications and performance figures for its MI250-class HPC GPUs. The MI250X features the full 14,080-core configuration and delivers 47.9, 95.7, and 383 TFLOPs of FP64/FP32/FP16, while the MI250 features 13,312 cores with 45.3, 90.5, and 362.1 TFLOPs of FP64/FP32/FP16 performance. The memory configuration remains the same between the two GPU configurations.
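Those throughput figures line up with the listed ~1700 MHz clock. A minimal sketch, assuming the table's clock speed, an FMA counting as 2 ops, and the matrix engines' FP16 rate inferred from the quoted figures:

```python
# Peak throughput = stream processors x ops per SP per clock x clock speed.
clock_ghz = 1.7                 # from the spec table (assumed boost clock)
sps = 14080                     # MI250X stream processors

fp64_tflops = sps * 2 * clock_ghz / 1000    # full-rate FP64, FMA = 2 ops
fp32_tflops = fp64_tflops * 2               # packed FP32 doubles the rate
fp16_tflops = sps * 16 * clock_ghz / 1000   # matrix engines: 16 ops/SP/clk
                                            # (inferred, not an AMD figure)

print(round(fp64_tflops, 1))    # 47.9
print(round(fp32_tflops, 1))    # 95.7
print(round(fp16_tflops, 1))    # 383.0
```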
AMD Instinct MI200 GPU Package:
The post AMD Unveils Instinct MI200 'Aldebaran' GPU, First 6nm MCM Product With 58 Billion Transistors, Over 14,000 Cores & 128 GB HBM2e Memory by Hassan Mujtaba appeared first on Wccftech.