Oak Ridge National Laboratory has published an overview of its Crusher system, which is powered by AMD's optimized 3rd Gen EPYC CPUs and Instinct MI250X GPUs.
ORNL's All-AMD Crusher System Overview Published: Features Optimized 3rd Gen EPYC CPUs & Instinct MI250X GPUs
The Crusher system is a test system for ORNL's upcoming Frontier supercomputer, which will feature the latest AMD EPYC 'Trento' CPUs and Instinct MI250X 'Aldebaran' GPUs. As such, it has a smaller number of nodes, but it still packs plenty of punch given the large number of CPU/GPU cores featured within it.
Crusher is a National Center for Computational Sciences (NCCS) moderate-security system that contains identical hardware and similar software as the upcoming Frontier system. It is used as an early-access testbed for Center for Accelerated Application Readiness (CAAR) and Exascale Computing Project (ECP) teams as well as NCCS staff and our vendor partners.
The overview published by ORNL states that the Crusher test system consists of two cabinets, one with 128 compute nodes and the other with 64 compute nodes, totaling 192 compute nodes in the full configuration. Each node features a single 64-core AMD EPYC 7A53 CPU based on the 3rd Gen optimized EPYC CPU architecture. We know that Frontier is going to be powered by AMD's Trento CPUs, an optimized variant of the Milan chip. It features the same 64 cores and 128 threads but adds optimizations to clocks and power efficiency. Each CPU has access to 512 GB of DDR4 memory.
On the GPU side, each node features four AMD Instinct MI250X GPUs, each packing two GCDs. Since each node treats a GCD as a separate GPU, Crusher has access to eight GPUs per node in total. Each MI250X GPU delivers up to 52 TFLOPs of peak FP64 compute horsepower, 220 compute units (110 per GCD), and 128 GB of HBM2e memory (64 GB per GCD) for up to 3.2 TB/s of bandwidth per MI250X accelerator. The two GCDs are linked together via an Infinity Fabric link that delivers 200 GB/s of bi-directional bandwidth.
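Multiplying the per-accelerator figures out to the full node gives a sense of how much hardware sits behind each Crusher node. This is a back-of-the-envelope sketch based on the numbers quoted above, not totals stated in the ORNL overview:

```python
# Per-node GPU totals implied by the figures above: 4x MI250X per Crusher node.
gpus_per_node = 4
fp64_tflops_per_gpu = 52      # peak FP64 per MI250X, as quoted
hbm_gb_per_gpu = 128          # HBM2e capacity per MI250X
hbm_tbps_per_gpu = 3.2        # HBM2e bandwidth per MI250X, TB/s

node_fp64_tflops = gpus_per_node * fp64_tflops_per_gpu   # 208 TFLOPs per node
node_hbm_gb = gpus_per_node * hbm_gb_per_gpu             # 512 GB of HBM2e per node
node_hbm_tbps = gpus_per_node * hbm_tbps_per_gpu         # 12.8 TB/s aggregate bandwidth
```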
Talking about interconnects, the AMD EPYC CPUs are connected to the GPUs via Infinity Fabric with a peak bandwidth of 36+36 GB/s. The Crusher nodes are connected via four HPE Slingshot 200 Gbit/s NICs (25 GB/s each), providing a node-injection bandwidth of 800 Gbps (100 GB/s).
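The node-injection figure is a straightforward unit conversion (four NICs, 8 bits per byte), which can be sanity-checked in a couple of lines:

```python
# Sanity check of the Slingshot node-injection bandwidth quoted above.
nics_per_node = 4
gbits_per_nic = 200                          # each HPE Slingshot NIC, Gbit/s
node_gbits = nics_per_node * gbits_per_nic   # 800 Gbit/s per node
node_gbytes = node_gbits / 8                 # 100 GB/s per node
```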
There are [4x] NUMA domains per node and [2x] L3 cache regions per NUMA domain for a total of [8x] L3 cache regions. The eight GPUs are each associated with one of the L3 regions as follows:
- hardware threads 000-007, 064-071 | GPU 4
- hardware threads 008-015, 072-079 | GPU 5
- hardware threads 016-023, 080-087 | GPU 2
- hardware threads 024-031, 088-095 | GPU 3
- hardware threads 032-039, 096-103 | GPU 6
- hardware threads 040-047, 104-111 | GPU 7
- hardware threads 048-055, 112-119 | GPU 0
- hardware threads 056-063, 120-127 | GPU 1
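The affinity table above follows a regular pattern: SMT sibling threads (e.g. 0 and 64) share a core, eight cores share an L3 region, and each region pairs with one GPU. The following sketch expresses that mapping as a small lookup; the function name and structure are illustrative, not anything ORNL ships:

```python
# GPU index paired with each of the 8 L3 cache regions, per the affinity table above.
CRUSHER_L3_TO_GPU = [4, 5, 2, 3, 6, 7, 0, 1]

def gpu_for_hw_thread(thread_id: int) -> int:
    """Return the GPU sharing an L3 region with a given Crusher hardware thread."""
    if not 0 <= thread_id <= 127:
        raise ValueError("Crusher nodes expose hardware threads 0-127")
    core = thread_id % 64     # SMT siblings (e.g. threads 0 and 64) map to the same core
    l3_region = core // 8     # 8 cores per L3 cache region, 8 regions per node
    return CRUSHER_L3_TO_GPU[l3_region]
```

For example, hardware threads 016-023 and 080-087 all resolve to GPU 2, matching the table.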
The following block diagram of a single Crusher node shows the interconnect bandwidths between the AMD EPYC CPU and Instinct MI250X GPU accelerators:
In addition to that, the Crusher system also hosts 250 PB of storage with a peak write speed of 2.5 TB/s, along with access to the center-wide NFS-based filesystem. Expect to see more from AMD's EPYC CPU and Instinct GPU platforms when they become operational in the Frontier supercomputer this year.