Filed under: AMD, AMD 6nm, AMD 6nm GPU, AMD 6nm MCM GPU, AMD Aldebaran, AMD CDNA 2, AMD Instinct, AMD Instinct MI200, AMD Instinct MI210, AMD Instinct MI250, AMD Instinct MI250X, AMD MCM, AMD MCM GPU, Announcement, CDNA 2, Featured, Hardware, Instinct MI210, Instinct MI250, Instinct MI250X, MCM GPU, News, Sticky

AMD Unveils Instinct MI200 ‘Aldebaran’ GPU, First 6nm MCM Product…

AMD has officially announced its future-era MI200 HPC GPU codenamed Aldebaran that utilizes a 6nm CDNA 2 architecture to produce insane compute functionality.

AMD Unveils Instinct MI200, Powering The Upcoming-Gen Compute Powerhouse With First 6nm MCM GPU Technological know-how & In excess of ninety five TFLOPs FP32 Overall performance

AMD is officially the very first to MCM know-how and they are executing so with a grand product which is their Instinct MI200 codenamed Aldebaran. The AMD Aldebaran GPU will appear in different varieties & dimensions but it truly is all based mostly on the manufacturer new CDNA 2 architecture which is the most refined variation of Vega. Some of the major features right before we go into depth are shown under:

  • AMD CDNA two architecture – 2nd Gen Matrix Cores accelerating FP64 and FP32 matrix functions, offering up to 4X the peak theoretical FP64 efficiency vs. AMD former-gen GPUs.
  • Management Packaging Technologies – Marketplace-1st multi-die GPU layout with two.5D Elevated Fanout Bridge (EFB) technologies provides 1.8X more cores and 2.7X higher memory bandwidth vs. AMD earlier-gen GPUs, featuring the industry’s greatest aggregate peak theoretical memory bandwidth at 3.two terabytes for every next.
  • third Gen AMD Infinity Material engineering – Up to eight Infinity Material hyperlinks join the AMD Instinct MI200 with 3rd Gen EPYC CPUs and other GPUs in the node to enable unified CPU/GPU memory coherency and optimize procedure throughput, letting for an a lot easier on-ramp for CPU codes to tap the energy of accelerators.

AMD Instinct MI200 GPU Die Shot:

Within the AMD Intuition MI200 is an Aldebaran GPU featuring two dies, a secondary and a most important. It has two dies with each individual consisting of eight shader engines for a overall of 16 SE’s. Just about every Shader Motor packs sixteen CUs with total-charge FP64, packed FP32 & a 2nd Era Matrix Engine for FP16 & BF16 operations.

Each individual die, as these types of, is composed of 128 compute units or 8192 stream processors. This rounds up to a whole of 220 compute units or 14,080 stream processors for the full chip. The Aldebaran GPU is also driven by a new XGMI interconnect. Every single chiplet functions a VCN two.six motor and the most important IO controller.

Built on AMD CDNA 2 architecture, AMD Instinct MI200 collection accelerators deliver leading application efficiency for a broad set of HPC workloads. The AMD Intuition MI250X accelerator provides up to four.9X improved general performance than competitive accelerators for double precision (FP64) HPC programs and surpasses 380 teraflops of peak theoretical half-precision (FP16) for AI workloads to permit disruptive approaches in additional accelerating knowledge-pushed exploration.

In phrases of general performance, AMD is touting various report wins in the HPC section more than NVIDIA’s A100 option with up to 3x general performance advancements in AMG.

As for DRAM, AMD has absent with an 8-channel interface consisting of 1024-little bit interfaces for an 8192-bit wide bus interface. Each interface can guidance 2GB HBM2e DRAM modules. This should really give us up to 16 GB of HBM2e memory capacity for each stack and since there are 8 stacks in overall, the total total of capability would be a whopping 128 GB. That’s forty eight GB much more than the A100 which residences eighty GB HBM2e memory. The memory will clock in at an insane pace of three.2 Gbps for a comprehensive-on bandwidth of three.two TB/s. This is a entire one.2 TB/s far more bandwidth than the A100 80 GB which has 2 TB/s.

The AMD Instinct MI200 will be powering 3 top rated-tier supercomputers which contain the United States’ exascale Frontier procedure the European Union’s pre-exascale LUMI system and Australia’s petascale Setonix procedure. The opposition contains the A100 80 GB which offers 19.5 TFLOPs of FP64, 156 TFLOPs of FP32 and 312 TFLOPs of FP16 compute ability. But we are likely to listen to about NVIDIA’s very own Hopper MCM GPU up coming calendar year so you can find likely to be a heated level of competition involving the two GPU juggernauts in 2022.

AMD Radeon Instinct Accelerators 2020

Accelerator Title AMD Instinct MI300 AMD Intuition MI250X AMD Instinct MI250 AMD Instinct MI210 AMD Intuition MI100 AMD Radeon Instinct MI60 AMD Radeon Intuition MI50 AMD Radeon Instinct MI25 AMD Radeon Instinct MI8 AMD Radeon Intuition MI6
GPU Architecture TBA (CDNA three) Aldebaran (CDNA 2) Aldebaran (CDNA 2) Aldebaran (CDNA two) Arcturus (CDNA one) Vega twenty Vega twenty Vega 10 Fiji XT Polaris 10
GPU System Node Superior Approach Node 6nm 6nm 6nm 7nm FinFET 7nm FinFET 7nm FinFET 14nm FinFET 28nm 14nm FinFET
GPU Dies four (MCM)? 2 (MCM) 2 (MCM) 2 (MCM) one (Monolithic) one (Monolithic) 1 (Monolithic) one (Monolithic) one (Monolithic) one (Monolithic)
GPU Cores 28,a hundred and sixty? fourteen,080 13,312 TBA 7680 4096 3840 4096 4096 2304
GPU Clock Speed TBA 1700 MHz 1700 MHz TBA ~1500 MHz 1800 MHz 1725 MHz 1500 MHz a thousand MHz 1237 MHz
FP16 Compute TBA 383 TOPs 362 TOPs TBA 185 TFLOPs 29.5 TFLOPs 26.five TFLOPs 24.6 TFLOPs 8.two TFLOPs five.seven TFLOPs
FP32 Compute TBA 95.7 TFLOPs ninety.five TFLOPs TBA 23.1 TFLOPs fourteen.seven TFLOPs 13.three TFLOPs twelve.three TFLOPs 8.2 TFLOPs 5.seven TFLOPs
FP64 Compute TBA forty seven.9 TFLOPs forty five.three TFLOPs TBA 11.5 TFLOPs 7.four TFLOPs 6.6 TFLOPs 768 GFLOPs 512 GFLOPs 384 GFLOPs
VRAM TBA 128 GB HBM2e 128 GB HBM2e TBA 32 GB HBM2 32 GB HBM2 16 GB HBM2 16 GB HBM2 4 GB HBM1 sixteen GB GDDR5
Memory Clock TBA 3.two Gbps 3.two Gbps TBA 1200 MHz a thousand MHz one thousand MHz 945 MHz five hundred MHz 1750 MHz
Memory Bus TBA 8192-bit 8192-bit 8192-little bit 4096-little bit bus 4096-bit bus 4096-little bit bus 2048-bit bus 4096-little bit bus 256-little bit bus
Memory Bandwidth TBA three.2 TB/s 3.2 TB/s TBA one.23 TB/s 1 TB/s 1 TB/s 484 GB/s 512 GB/s 224 GB/s
Kind Component TBA OAM OAM Dual Slot Card Twin Slot, Total Duration Twin Slot, Whole Duration Twin Slot, Total Duration Dual Slot, Total Length Twin Slot, Half Length Single Slot, Entire Length
Cooling TBA Passive Cooling Passive Cooling Passive Cooling Passive Cooling Passive Cooling Passive Cooling Passive Cooling Passive Cooling Passive Cooling
TDP TBA 560W 500W? TBA 300W 300W 300W 300W 175W 150W

The Aldebaran MI200 GPU will appear in a few configurations, the OAM only MI250 and MI250X & the dual-slot PCIe MI210. AMD has only shared comprehensive specs and functionality figures for its MI250 class HPC GPUs. The MI250X capabilities the complete 14,080 configurations and provides forty seven.9, ninety five.7, 383 TFLOPs of FP64/FP32/FP16 when the MI250 capabilities 13,312 cores with 45.3,ninety.five,362.one TFLOPs of FP64/FP32/FP16 functionality. The memory configuration remains the exact in between the two GPU configurations.

AMD Instinct MI200 GPU Package:

The write-up AMD Unveils Intuition MI200 ‘Aldebaran’ GPU, Initially 6nm MCM Item With fifty eight Billion Transistors, Around fourteen,000 Cores & 128 GB HBM2e Memory by Hassan Mujtaba appeared 1st on Wccftech.