Lawrence Livermore National Laboratory Adds New HPC Clusters

Discussion in 'HardForum Tech News' started by cageymaru, Nov 12, 2018.

  1. cageymaru

    cageymaru [H]ard as it Gets

    Messages:
    19,467
    Joined:
    Apr 10, 2003
The Lawrence Livermore National Laboratory has added a new high performance computing (HPC) cluster called Corona, built in partnership with Penguin Computing using AMD and Mellanox Technologies hardware. The unclassified HPC cluster will allow researchers and industry partners to explore data science, machine learning, and big data analytics. The system features AMD EPYC processors, AMD Radeon Instinct GPUs, and a Mellanox HDR 200 Gigabit InfiniBand network. The cluster will be used by the National Nuclear Security Administration's (NNSA) Advanced Simulation and Computing (ASC) program and will be dedicated to partnerships with American industry. "The cluster consists of 170 two-socket nodes incorporating 24-core AMD EPYC 7401 processors and a PCIe 1.6 Terabyte (TB) nonvolatile (solid-state) memory device. Each Corona compute node is GPU-ready with half of those nodes utilizing four AMD Radeon Instinct MI25 GPUs per node, delivering 4.2 petaFLOPS of FP32 peak performance. The remaining compute nodes may be upgraded with future GPUs."
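    A quick sanity check of the quoted 4.2 petaFLOPS figure, assuming the MI25's published FP32 peak of 12.29 TFLOPS per card (a spec-sheet number not stated in the post):

    ```python
    # Back-of-the-envelope check of Corona's quoted FP32 peak.
    nodes_total = 170
    gpu_nodes = nodes_total // 2       # half the nodes carry GPUs
    gpus_per_node = 4
    mi25_fp32_tflops = 12.29           # AMD spec-sheet FP32 peak per MI25 (assumption)

    total_gpus = gpu_nodes * gpus_per_node              # 340 GPUs
    peak_pflops = total_gpus * mi25_fp32_tflops / 1000  # TFLOPS -> PFLOPS
    print(f"{total_gpus} GPUs -> {peak_pflops:.2f} petaFLOPS FP32 peak")
    ```

    That works out to roughly 4.18 petaFLOPS, which matches the 4.2 petaFLOPS in the announcement after rounding.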

    In other HPC news, the Sierra supercomputer, also housed at the Lawrence Livermore National Laboratory, has taken the No. 2 slot on the TOP500 list of the world's fastest supercomputers. It is an IBM/NVIDIA system built for the NNSA that "reached 94.6 petaflops (quadrillion floating-point operations per second) on the High Performance Linpack (HPL), a benchmark test that TOP500 uses to determine a system's speed. The score, up from 71.6 in June, pushed Sierra past China's Sunway TaihuLight, a system developed by China's National Research Center of Parallel Computer Engineering & Technology (NRCPC)." This means that 5 Department of Energy supercomputers are now in the top 10 of the TOP500 list. Sierra uses IBM POWER9 processors with CPU-to-GPU connections via NVIDIA's NVLink interconnect, providing the memory bandwidth needed to feed its 17,000 Tesla V100 Tensor Core GPUs.

    "AMD welcomes the delivery of the Corona system to the HPCIC and the selection of high-performance AMD EPYC processors and AMD Radeon Instinct accelerators for the cluster," said Mark Papermaster, AMD's senior vice president and chief technology officer. "The collaboration between AMD, Penguin, Mellanox and Lawrence Livermore National Labs has built a world-class HPC system that will enable researchers to push the boundaries of science and innovation."
     
  2. ole-m

    ole-m Limp Gawd

    Messages:
    406
    Joined:
    Oct 5, 2015
    EPYC adoption isn't surprising.
    I expect a 40/60 split dividing Zen 1 vs. Xeon Skylake-SP.

    What is interesting and impressive is that Radeon Instinct is actually being adopted.
     
  3. benedict

    benedict n00b

    Messages:
    37
    Joined:
    Nov 13, 2018
    Based on the numbers given here, simple math tells us it takes 80.95 Radeon Instinct MI25 GPUs to deliver 1 petaFLOPS, while we need 179.7 Tesla Tensor Core V100 GPUs to do the same. Win for the Radeons or just bad scaling for the Teslas?
     
  4. N4CR

    N4CR 2[H]4U

    Messages:
    3,612
    Joined:
    Oct 17, 2011
    Interesting indeed.
    Probably depends on workload (FP32 etc.). I wouldn't be surprised if scientific use demands that precision, and that's where AMD is doing well.