
AMD Instinct MI200 CDNA 2 MCM GPU Is A Beast: 1.7 GHz Clocks, 47.9 TFLOPs FP64 & A 4X Increase In FP16 Performance Over MI100

AMD’s flagship Instinct MI200 is on the verge of launch, and it will be the first GPU for the HPC segment to feature an MCM design based on the CDNA 2 architecture. It looks like the GPU will deliver some insane performance numbers compared to the existing Instinct MI100 GPU, with a 4x increase in FP16 compute.

AMD Instinct MI200 With CDNA 2 MCM GPU Design Heading To HPC Soon, Features Monstrous Performance Numbers & A 4x Compute Boost Over Instinct MI100

We have gotten to learn the specifications of the Instinct MI200 accelerator over time, but its overall performance figures have remained a mystery until now. Twitter insider and leaker ExecutableFix has shared the first performance metrics for AMD’s CDNA 2 based MCM GPU accelerator, and it is a beast.

According to tweets by ExecutableFix, the AMD Instinct MI200 will be rocking a clock speed of up to 1.7 GHz, a 13% increase over the Instinct MI100. The CDNA 2 powered MCM GPU also packs nearly twice the number of stream processors at 14,080 cores across 220 Compute Units. While it was expected that the GPU would rock 240 Compute Units with 15,360 cores, that config has been replaced by a cut-down variant due to yields. With that said, it is possible that we may see the full SKU launch in the future, offering even higher performance.

In terms of performance, the AMD Instinct MI200 HPC accelerator is going to offer nearly 50 TFLOPs (47.9 TFLOPs) of FP64 & FP32 compute horsepower. Versus the Instinct MI100, this is a 4.16x increase in the FP64 segment. In fact, the FP64 numbers of the MI200 exceed the FP32 performance of its predecessor. Moving over to the FP16 and BF16 numbers, we are looking at an insane 383 TFLOPs of performance. For perspective, the MI100 only delivers 92.3 TFLOPs of peak BFloat16 performance and 184.6 TFLOPs of peak FP16 performance.
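The leaked figures line up with a simple back-of-the-envelope calculation. This sketch assumes 64 stream processors per Compute Unit, 2 FLOPs per clock per core for full-rate FP64 (one FMA), and an 8x FP16/BF16 rate via the Matrix Engines; these rates are inferred from the leaked numbers, not confirmed by AMD.

```python
# Back-of-the-envelope check of the leaked MI200 peak-throughput figures.
# Assumed rates: full-rate FP64 FMA (2 FLOPs/clock/core) and an 8x FP16 rate.

def peak_tflops(cores, clock_ghz, flops_per_clock):
    """Theoretical peak = cores x clock (GHz) x FLOPs issued per clock per core."""
    return cores * clock_ghz * flops_per_clock / 1000

cores = 220 * 64  # 220 CUs x 64 stream processors = 14,080
clock = 1.7       # GHz

fp64 = peak_tflops(cores, clock, 2)   # FP64 FMA
fp16 = peak_tflops(cores, clock, 16)  # 8x the FP64 rate

print(f"FP64: {fp64:.1f} TFLOPs")  # ~47.9, matching the leak
print(f"FP16: {fp16:.1f} TFLOPs")  # ~383, matching the leak
```

The same formula reproduces the MI100's figures (7,680 cores at ~1.5 GHz with quarter-rate FP64), which is why the roughly 2x core count and 13% clock bump compound into such a large generational jump.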

As per HPCWire, the AMD Instinct MI200 will be powering three top-tier supercomputers including the United States’ exascale Frontier system, the European Union’s pre-exascale LUMI system, and Australia’s petascale Setonix system. The competition consists of the A100 80 GB, which delivers 19.5 TFLOPs of FP64, 156 TFLOPs of FP32, and 312 TFLOPs of FP16 compute power. But we are likely to hear about NVIDIA’s own Hopper MCM GPU next year, so there’s going to be heated competition between the two GPU juggernauts in 2022.

Here’s What To Expect From AMD Instinct MI200 ‘CDNA 2’ GPU Accelerator

Within the AMD Instinct MI200 is an Aldebaran GPU that consists of two dies, a secondary and a primary. Each die carries eight shader engines for a total of 16 SEs. Each Shader Engine packs 16 CUs with full-rate FP64, packed FP32 & a 2nd Generation Matrix Engine for FP16 & BF16 operations. Each die, as such, consists of 128 compute units or 8,192 stream processors, with the full chip shipping with 220 compute units or 14,080 stream processors enabled. The Aldebaran GPU is also powered by a new XGMI interconnect. Each chiplet features a VCN 2.6 engine and the main IO controller.

The block diagram of AMD’s CDNA 2 powered Aldebaran GPU, which will power the Instinct MI200 HPC accelerator, has been visualized. (Image Credits: Locuza)

As for DRAM, AMD has gone with an 8-channel design consisting of 1024-bit interfaces for an 8192-bit wide bus. Each interface supports 2 GB HBM2e DRAM modules. This should give us up to 16 GB of HBM2e memory capacity per stack, and since there are eight stacks in total, the total capacity would be a whopping 128 GB. That’s 48 GB more than the A100, which houses 80 GB of HBM2e memory. The full visualization of the Aldebaran GPU on the Instinct MI200 is available here.
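The memory math above can be laid out in a few lines. Note that the 8-Hi stack height is an assumption that follows from 2 GB modules yielding 16 GB per stack; AMD has not confirmed the stack configuration.

```python
# Sketch of the MI200 memory configuration described above.

STACKS = 8                 # eight HBM2e stacks, one per channel
BUS_PER_STACK_BITS = 1024  # each HBM2e stack has a 1024-bit interface
MODULE_GB = 2              # 2 GB HBM2e DRAM dies
MODULES_PER_STACK = 8      # assumed 8-Hi stacks -> 16 GB per stack

bus_width = STACKS * BUS_PER_STACK_BITS            # 8192-bit
capacity = STACKS * MODULES_PER_STACK * MODULE_GB  # 128 GB

print(f"Bus width: {bus_width}-bit, capacity: {capacity} GB")
print(f"Advantage over A100 80 GB: {capacity - 80} GB")  # 48 GB
```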

AMD Radeon Instinct Accelerators 2020

| Accelerator Name | AMD Instinct MI300 | AMD Instinct MI200 | AMD Instinct MI100 | AMD Radeon Instinct MI60 | AMD Radeon Instinct MI50 | AMD Radeon Instinct MI25 | AMD Radeon Instinct MI8 | AMD Radeon Instinct MI6 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPU Architecture | TBA (CDNA 3) | Aldebaran (CDNA 2) | Arcturus (CDNA 1) | Vega 20 | Vega 20 | Vega 10 | Fiji XT | Polaris 10 |
| GPU Process Node | Advanced Process Node | Advanced Process Node | 7nm FinFET | 7nm FinFET | 7nm FinFET | 14nm FinFET | 28nm | 14nm FinFET |
| GPU Dies | 4 (MCM)? | 2 (MCM) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) | 1 (Monolithic) |
| GPU Cores | 28,160? | 14,080? | 7680 | 4096 | 3840 | 4096 | 4096 | 2304 |
| GPU Clock Speed | TBA | ~1700 MHz | ~1500 MHz | 1800 MHz | 1725 MHz | 1500 MHz | 1000 MHz | 1237 MHz |
| FP16 Compute | TBA | 383 TOPs | 185 TFLOPs | 29.5 TFLOPs | 26.5 TFLOPs | 24.6 TFLOPs | 8.2 TFLOPs | 5.7 TFLOPs |
| FP32 Compute | TBA | 95.8 TFLOPs | 23.1 TFLOPs | 14.7 TFLOPs | 13.3 TFLOPs | 12.3 TFLOPs | 8.2 TFLOPs | 5.7 TFLOPs |
| FP64 Compute | TBA | 47.9 TFLOPs | 11.5 TFLOPs | 7.4 TFLOPs | 6.6 TFLOPs | 768 GFLOPs | 512 GFLOPs | 384 GFLOPs |
| VRAM | TBA | 64/128 GB HBM2e? | 32 GB HBM2 | 32 GB HBM2 | 16 GB HBM2 | 16 GB HBM2 | 4 GB HBM1 | 16 GB GDDR5 |
| Memory Clock | TBA | TBA | 1200 MHz | 1000 MHz | 1000 MHz | 945 MHz | 500 MHz | 1750 MHz |
| Memory Bus | TBA | 8192-bit | 4096-bit | 4096-bit | 4096-bit | 2048-bit | 4096-bit | 256-bit |
| Memory Bandwidth | TBA | ~2 TB/s? | 1.23 TB/s | 1 TB/s | 1 TB/s | 484 GB/s | 512 GB/s | 224 GB/s |
| Form Factor | TBA | Dual Slot, Full Length / OAM | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Half Length | Single Slot, Full Length |
| Cooling | TBA | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling |
| TDP | TBA | TBA | 300W | 300W | 300W | 300W | 175W | 150W |

The post AMD Instinct MI200 CDNA 2 MCM GPU Is A Beast: 1.7 GHz Clocks, 47.9 TFLOPs FP64 & Over 4X Increase In FP64/BF16 Performance Over MI100 by Hassan Mujtaba appeared first on Wccftech.