AMD-Powered Frontier Supercomputer Breaks the Exascale Barrier, Now Fastest in the World

Frontier is the first officially recognized exascale supercomputer in the world. It tops the Top500 list of the world’s fastest supercomputers, a list on which AMD-powered systems have expanded their presence dramatically this year. Frontier doesn’t just beat the previous leader, Japan’s Fugaku, it blows the rest of the field out of the water: Frontier is faster than the next seven supercomputers on the Top500 list combined. Notably, while Frontier hit 1.10 ExaFLOPS during a sustained Linpack FP64 benchmark, the machine delivers up to 1.70 ExaFLOPS in peak performance and has room to reach 2 ExaFLOPS after more tuning. For reference, one ExaFLOPS equals one quintillion floating-point operations per second.
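To put those units in perspective, here is a minimal Python sketch of the conversions involved. It assumes standard SI prefixes and uses only the 1.10 and 1.70 ExaFLOPS figures quoted above; the helper names are hypothetical and chosen purely for illustration.

```python
# SI prefixes for floating-point operations per second.
PETA = 10**15  # one quadrillion
EXA = 10**18   # one quintillion


def exaflops_to_flops(exaflops: float) -> float:
    """Convert ExaFLOPS to raw floating-point operations per second."""
    return exaflops * EXA


def exaflops_to_petaflops(exaflops: float) -> float:
    """Convert ExaFLOPS to PetaFLOPS (1 ExaFLOPS = 1,000 PetaFLOPS)."""
    return exaflops * EXA / PETA


# Figures quoted in the article.
sustained_ef = 1.10  # sustained Linpack FP64 result
peak_ef = 1.70       # quoted peak performance

sustained_pf = exaflops_to_petaflops(sustained_ef)
linpack_fraction = sustained_ef / peak_ef  # share of peak reached on Linpack

print(f"{sustained_pf:,.0f} PetaFLOPS sustained")
print(f"{linpack_fraction:.0%} of quoted peak")
```

The ratio of sustained to peak performance is one rough way to see how much headroom further tuning might recover.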

In addition to its high performance, Frontier is also one of the most energy-efficient supercomputers in the world. It ranked number two on the Green500 list, behind only its own smaller test and development system, Frontier TDS. Frontier delivers 52.23 gigaflops per watt while consuming just 21.1 megawatts of electricity during the Linpack benchmark run.
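The efficiency and power figures can be cross-checked against the Linpack result. The short Python sketch below multiplies the two numbers quoted above; it assumes both were measured during the same benchmark run, and the variable names are illustrative only.

```python
# Figures quoted in the article.
gflops_per_watt = 52.23  # Green500 efficiency
power_megawatts = 21.1   # power draw

# Convert to base units: FLOPS per watt, and watts.
flops_per_watt = gflops_per_watt * 1e9
power_watts = power_megawatts * 1e6

# Efficiency times power gives the implied sustained performance.
implied_flops = flops_per_watt * power_watts
implied_exaflops = implied_flops / 1e18

print(f"Implied sustained performance: {implied_exaflops:.2f} ExaFLOPS")
```

The product lands at roughly 1.10 ExaFLOPS, which lines up with the sustained Linpack figure.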

AMD EPYC-powered systems now make up five of the top 10 supercomputers on the TOP500 list. AMD EPYC-powered systems also appear in 94 of the TOP500 supercomputers in the world, marking a steady rise from the 73 systems listed in November 2021. AMD EPYC processors also appear in more than half of all new systems added to the TOP500 this year.

AMD also dominates the Green500 list, powering the four most energy-efficient supercomputers in the world, eight of the top ten, and seventeen of the top twenty.

Frontier is installed at the DOE’s Oak Ridge National Laboratory (ORNL) in Tennessee. The system includes 9,408 compute nodes, each with one 64-core AMD EPYC “Trento” processor paired with 512GB of DDR4 memory and four AMD Instinct MI250X GPUs. These nodes are spread across 74 HPE Cray EX cabinets, each weighing 8,000 lbs. The total system has 602,112 CPU cores and 4.6 PB of DDR4 RAM.
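The system-wide totals follow directly from the per-node configuration. As a quick sanity check, this Python sketch multiplies the per-node figures quoted above across all nodes. It assumes every node is identical and that the memory total is expressed in binary units (1 PB taken as 1024 × 1024 GB), which is an assumption made here to match the rounded 4.6 PB figure.

```python
# Per-node configuration quoted in the article.
NODES = 9_408
CORES_PER_CPU = 64      # one CPU per node
GPUS_PER_NODE = 4
DDR4_GB_PER_NODE = 512

# System-wide totals, assuming all nodes are identical.
total_cores = NODES * CORES_PER_CPU
total_gpus = NODES * GPUS_PER_NODE

# Assumption: binary units, so 1 PB is treated as 1024 * 1024 GB.
total_ddr4_pb = NODES * DDR4_GB_PER_NODE / 1024**2

print(f"CPU cores: {total_cores:,}")
print(f"GPUs:      {total_gpus:,}")
print(f"DDR4:      {total_ddr4_pb:.1f} PB")
```

The core count and memory total reproduce the figures in the text; the GPU total is simply what four accelerators per node implies.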

AMD’s new supercomputers are designed to tackle big problems. The machines are built around dual AMD EPYC Rome processors running at 2.2GHz each, along with 128GB of DDR4 RAM and two NVIDIA Tesla V100 accelerators. Each accelerator has 16GB of high-bandwidth memory and supports up to six NVLink links, and it connects to other components over a PCIe Gen 3 x16 interface. The systems will be housed inside a custom chassis with 12 fans, including six 120mm cooling fans. The systems will run Windows Server 2016 and Red Hat Enterprise Linux 7.3.

The entire system is connected to an insanely fast storage subsystem with 700 petabytes of capacity, 75TB/s of throughput, and 15 billion IOPS. Metadata is stored across 480 NVMe SSDs that provide 10 PB of capacity, while 5,400 NVMe SSDs provide 11.5 PB of capacity for the primary high-speed storage tier. Meanwhile, 47,700 PMR HDDs provide 679 PB of capacity.

Assembling Frontier was a huge project. It took six years to complete and required 60 million parts. ORNL had to source all of them, spanning 685 different part numbers. The chip shortage hit mid-construction and affected 167 of those part numbers, so ORNL had to find replacements for them. AMD also ran short of parts for its MI200 GPUs, which were critical for the project. To help work around the shortage, ORNL worked with the DOE’s Advanced Scientific Computing Research (ASCR) office to get Defense Priorities and Allocations System (DPAS) ratings for the parts. That meant the US government invoked the Defense Production Act to procure the parts because of Frontier’s importance to national security.

Frontier is the first officially recognized exascale supercomputer. It is capable of more than 1,000 petaflops, which means it can perform more than 1 quintillion calculations per second. By comparison, Titan, an earlier ORNL supercomputer that once topped the Top500, delivered about 17.6 petaflops on Linpack, or roughly 17.6 quadrillion calculations per second.

Intel’s Aurora system was once expected to be the first exascale system. There is also reason for caution about what the submitted results mean. The Top500 list is compiled using a benchmark called Linpack, which measures floating-point operations per second while solving a dense system of linear equations. The submitted result is a proxy for real-world performance: it is lower than the system’s theoretical peak, and it does not necessarily reflect what the machine will achieve on actual applications.

Aurora is a 2+ ExaFLOPS supercomputer that was originally expected to come online in 2021 but has since been delayed. Upon its completion, this Intel-powered supercomputer will compete with the AMD-powered El Capitan for the title of the world’s fastest supercomputer.

