Lawrence Livermore National Laboratory Adds New HPC Clusters

Discussion in 'HardForum Tech News' started by cageymaru, Nov 12, 2018.

  1. cageymaru

    cageymaru [H]ard as it Gets

    Messages:
    19,467
    Joined:
    Apr 10, 2003
The Lawrence Livermore National Laboratory has added a new high performance computing (HPC) cluster called Corona, built in partnership with Penguin Computing using AMD and Mellanox Technologies hardware. The unclassified HPC cluster will allow researchers and industry partners to explore data science, machine learning, and big data analytics. The system features AMD EPYC processors, AMD Radeon Instinct GPUs, and a Mellanox HDR 200 Gigabit InfiniBand network. The cluster will be used by the National Nuclear Security Administration's (NNSA) Advanced Simulation and Computing (ASC) program and will be dedicated to partnerships with American industry. "The cluster consists of 170 two-socket nodes incorporating 24-core AMD EPYC 7401 processors and a PCIe 1.6 Terabyte (TB) nonvolatile (solid-state) memory device. Each Corona compute node is GPU-ready with half of those nodes utilizing four AMD Radeon Instinct MI25 GPUs per node, delivering 4.2 petaFLOPS of FP32 peak performance. The remaining compute nodes may be upgraded with future GPUs."
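    A quick sanity check of the quoted 4.2 petaFLOPS figure, assuming the MI25's published FP32 peak of 12.29 TFLOPS per card (a spec-sheet number not stated in the post):

    ```python
    # Back-of-the-envelope check of Corona's quoted FP32 peak.
    nodes_total = 170
    gpu_nodes = nodes_total // 2       # half the nodes carry GPUs
    gpus_per_node = 4
    mi25_fp32_tflops = 12.29           # AMD spec-sheet FP32 peak per MI25 (assumption)

    total_gpus = gpu_nodes * gpus_per_node              # 340 GPUs
    peak_pflops = total_gpus * mi25_fp32_tflops / 1000  # TFLOPS -> PFLOPS
    print(f"{total_gpus} GPUs -> {peak_pflops:.2f} petaFLOPS FP32 peak")
    ```

    That works out to roughly 4.18 petaFLOPS, which matches the 4.2 petaFLOPS in the announcement after rounding.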

    In other HPC news, the Sierra supercomputer, also housed at the Lawrence Livermore National Laboratory, has taken the No. 2 slot on the TOP500 list of the world's fastest supercomputers. It is an IBM/NVIDIA system built for the NNSA that "reached 94.6 petaflops (quadrillion floating-point operations per second) on the High Performance Linpack (HPL), a benchmark test that TOP500 uses to determine a system's speed. The score, up from 71.6 in June, pushed Sierra past China's Sunway TaihuLight, a system developed by China's National Research Center of Parallel Computer Engineering & Technology (NRCPC)." This means that 5 Department of Energy supercomputers are now in the top 10 of the TOP500 list. Sierra uses IBM POWER9 processors with CPU-to-GPU connections via NVIDIA's NVLink interconnect, providing the memory bandwidth needed to feed its 17,000 Tesla V100 Tensor Core GPUs.

    "AMD welcomes the delivery of the Corona system to the HPCIC and the selection of high-performance AMD EPYC processors and AMD Radeon Instinct accelerators for the cluster," said Mark Papermaster, AMD's senior vice president and chief technology officer. "The collaboration between AMD, Penguin, Mellanox and Lawrence Livermore National Labs has built a world-class HPC system that will enable researchers to push the boundaries of science and innovation."
     
  2. ole-m

    ole-m Limp Gawd

    Messages:
    406
    Joined:
    Oct 5, 2015
    EPYC adoption isn't surprising.
    I expect a 40/60 split dividing Zen 1 vs. Xeon Skylake-SP.

    What is interesting and impressive is that Radeon Instinct is actually being adopted.
     
  3. benedict

    benedict n00b

    Messages:
    37
    Joined:
    Nov 13, 2018
    Based on the numbers given here, simple math tells us it takes 80.95 Radeon Instinct MI25 GPUs to deliver 1 petaFLOPS, while we need 179.7 Tesla Tensor Core V100 GPUs to do the same. Win for the Radeons or just bad scaling for the Teslas?
     
  4. N4CR

    N4CR 2[H]4U

    Messages:
    3,612
    Joined:
    Oct 17, 2011
    Interesting indeed.
    Probably depends on workload (FP32 etc.). I wouldn't be surprised if scientific use demands that precision, and that's where AMD is doing well.