Intel Xeon Phi x200 Knights Landing x86 Compatibility Test

Discussion in 'HardForum Tech News' started by Megalith, Dec 10, 2016.

  1. Megalith

    Megalith 24-bit/48kHz Staff Member

    Okay, so nobody is going to run Windows regularly on these 64-core beasts, but it’s cool to see that Intel’s claim that Knights Landing can run legacy x86 code with little fuss holds up. Thanks to Patriot for the link.

    The run did not utilize any of the KNL features that give the platform its true power, such as AVX-512. That testing is coming. We did validate one of the major selling points of KNL: code made to run on other Intel x86 architectures will work without modification on the new Intel Xeon Phi x200 (KNL) generation. The impact of this is enormous and is a key reason we saw so many KNL supercomputer system wins at SC16 this year. Applications can access a highly parallelized architecture without needing a co-processor. The Intel Xeon Phi x200 has direct access to the six-channel system RAM as well as 16GB of high-bandwidth/low-latency MCDRAM, without a performance penalty from traversing the PCIe bus.
     
  2. atp1916

    atp1916 [H]ard|DCoTM x1

    That's pretty damn cool.

    Cinebench R15 run = :D:D:D
     
  3. Quartz-1

    Quartz-1 [H]ardness Supreme

    Why not? Microsoft have migrated Windows and Office to non-x86 platforms before. Windows Compute Server, anyone?
     
  4. pxc

    pxc [H]ard as it Gets

    64 Silvermont-derived cores with 4 threads per core. OK. :p

    As the guy explained in the video, that isn't the best test of the Xeon Phi's abilities, and I agree with atp1916 above that it's a cool demo.
     
  5. Zarathustra[H]

    Zarathustra[H] Official Forum Curmudgeon

    I wonder what type of loads benefit from four threads per core...

    Especially on weak-ass Atom-type cores.

    I mean, I know there's lots of them, but still...
     
  6. ir0nw0lf

    ir0nw0lf [H]ardness Supreme

    When I first saw MCDRAM I thought of McDonald's. :ROFLMAO:
     
  7. pxc

    pxc [H]ard as it Gets

    That's how GPU-type cores are made, in order to maximize utilization. The purpose of SMT, on both GPUs and CPUs, is to hide memory access latency (and other pipeline stalls) by switching to another thread while the first waits for its memory load to complete. I'm not sure how efficient Intel's context switching is, but on AMD and Nvidia GPUs it's essentially free to switch out a stalled wavefront or warp (AMD's and Nvidia's names, respectively, for the group of threads that executes in lockstep: 64 on AMD's GCN, 32 on Nvidia) and work on another that is ready.
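
    To put rough numbers on the latency-hiding idea, here is a minimal toy model, not a description of how KNL actually schedules threads. It assumes every hardware thread alternates a fixed burst of work with a fixed memory stall, and that switching threads is free. The function name and the cycle counts are made up for illustration.

```python
def core_utilization(threads, compute_cycles, stall_cycles):
    """Fraction of cycles the core spends on useful work in a toy SMT model.

    Assumptions (not from the article): every hardware thread repeats
    compute_cycles of work followed by stall_cycles of waiting on memory,
    and switching between threads costs nothing.
    """
    if threads < 1 or compute_cycles < 1 or stall_cycles < 0:
        raise ValueError("need threads >= 1, compute_cycles >= 1, stall_cycles >= 0")
    # Share of its own time a single thread can keep the core busy.
    busy_share = compute_cycles / (compute_cycles + stall_cycles)
    # Extra threads fill the gaps until the core is fully occupied.
    return min(1.0, threads * busy_share)


if __name__ == "__main__":
    # Hypothetical numbers: 10 cycles of work, then a 30-cycle memory stall.
    for n in (1, 2, 4):
        print(n, "thread(s):", core_utilization(n, 10, 30))
```

    With those made-up numbers, one thread keeps the core busy a quarter of the time and four threads fill it completely. Real workloads stall irregularly, so treat this as an upper bound on what extra threads can recover.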

    As "weak" as those Atom cores are at running full x86 code, they're far faster per core than Nvidia's or AMD's streaming processors for that style of code (essentially semi-random read patterns and branchy code). Those two patterns kill performance in classic GPGPU-style programming, and they also cost the Xeon Phi a lot of performance compared with running optimized AVX-512 code on its multiple SIMD units per core. :p That Cinebench demo is not a typical task for a GPU, but it does show how flexible each core is at running general code fairly efficiently.
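
    For anyone wondering why branchy code hurts wide SIMD so much, a rough sketch of the usual argument follows. It assumes the hardware runs every distinct branch path in turn with the non-participating lanes masked off, and that each path costs the same. The function name is hypothetical, and the 16 lanes stand for the 16 single-precision floats that fit in one 512-bit register.

```python
def simd_efficiency(path_per_lane):
    """Useful fraction of SIMD work when lanes take different branch paths.

    Assumptions: each distinct path is executed once over the whole vector
    with inactive lanes masked off, and every path costs the same.
    """
    if not path_per_lane:
        raise ValueError("need at least one lane")
    # One masked pass over the vector per distinct path taken.
    passes = len(set(path_per_lane))
    # Each lane does useful work in exactly one of those passes.
    return 1.0 / passes


if __name__ == "__main__":
    lanes = 16  # 512 bits / 32-bit floats
    print("uniform:  ", simd_efficiency([0] * lanes))
    print("if/else:  ", simd_efficiency([i % 2 for i in range(lanes)]))
    print("4-way mix:", simd_efficiency([i % 4 for i in range(lanes)]))
```

    Under those assumptions a vector whose lanes all agree runs at full efficiency, a diverging if/else halves it, and a four-way split leaves a quarter. A scalar core running the same code pays for mispredicted branches instead, which is a different and usually smaller cost.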