Linux Application Scaling Featuring 128 Threads from Dual AMD EPYC 7601 Processors

Discussion in '[H]ard|OCP Front Page News' started by cageymaru, Oct 11, 2018.

  1. cageymaru

    cageymaru [H]ard|News

    Messages:
    19,464
    Joined:
    Apr 10, 2003
    Michael Larabel of Phoronix has conducted a test of Linux application scaling on up to 128 threads. He chose a Dell PowerEdge R7425 server featuring two AMD EPYC 7601 processors for a total of 64 cores and 128 threads, 512GB of RAM (16 x 32GB DDR4), and 20 x 500GB Samsung 860 EVO SSDs. He ran the server through a battery of tests to see how Linux applications and benchmarks scaled from 2 threads to 128 threads. He has even more results here. If you are curious about how much AMD EPYC processors cost in relation to Intel Xeon processors, the AMD EPYC Processor Selector Tool is found here.

    This Dell PowerEdge server packing two AMD EPYC 7601 processors can build the Linux kernel in just 25 seconds!
     
    Red Falcon, schmide and DrezKill like this.
  2. seanreisk

    seanreisk Gawd

    Messages:
    615
    Joined:
    Aug 29, 2011
    Those NAMD scores are almost perfect, but they should be since Charm++ is all about threading. But NAMD is such a "which came first, the chicken or the egg?" piece of software because you almost need an advanced degree just to learn how to use it.
     
  3. mashie

    mashie Mawd Gawd

    Messages:
    4,165
    Joined:
    Oct 25, 2000
    But can it play Crysis?
     
  4. Sikkyu

    Sikkyu I Question Reality

    Messages:
    2,967
    Joined:
    Jan 21, 2010
    we need a downvote option
     
  5. gxp500

    gxp500 Gawd

    Messages:
    803
    Joined:
    Mar 4, 2015
    Crysis isn't multithreaded, so no.
     
    Krazy925 likes this.
  6. naib

    naib [H]ard|Gawd

    Messages:
    1,140
    Joined:
    Jul 26, 2013
    So windows with intel chip. :)
     
  7. drescherjm

    drescherjm [H]ardForum Junkie

    Messages:
    14,087
    Joined:
    Nov 19, 2008

    And I thought my Ryzen 2700 was quick at that compared to the E3 Xeons I have at work..

    Being a Gentoo user I have done several thousand kernel compiles.
     
    JosiahBradley and AceGoober like this.
  8. DrBorg

    DrBorg Gawd

    Messages:
    543
    Joined:
    Jan 22, 2005
    Yes, it is.

    It was added in patch 1.2, iirc.

    It still won't play with every eyecandy turned on any faster than ~60fps, on most systems today.

    I still play Crysis, and CrysisWars.

    And yes, you Can play online without Gamespy.
     
    Krazy925 likes this.
  9. Absalom

    Absalom Limp Gawd

    Messages:
    496
    Joined:
    Oct 3, 2007
    It can, but it won't. Kings and nobles have little concern for the peasant.
     
  10. Elf_Boy

    Elf_Boy 2[H]4U

    Messages:
    2,176
    Joined:
    Nov 16, 2007
    Will WOW not be CPU bound?
     
    Sulphademus likes this.
  11. cyclone3d

    cyclone3d [H]ardForum Junkie

    Messages:
    12,747
    Joined:
    Aug 16, 2004
    Yawn... wake me up when they test 2^32-1 thread scaling. Still waiting for that as my prime number program supports that many threads.

    I'll change one variable to make it be able to do 2^64-1 threads once we get that high.. heh.
     
  12. cyclone3d

    cyclone3d [H]ardForum Junkie

    Messages:
    12,747
    Joined:
    Aug 16, 2004
    Nope.. because even though WoW is multithreaded, it still does a majority of the work on 1 thread as far as I know.

    Have they even changed the default threading support or do you still have to manually configure it in a configuration file? If not, then they are being dumb. Changing the default over 10 years ago helped immensely. I don't see why they wouldn't have changed it by now.
     
  13. gxp500

    gxp500 Gawd

    Messages:
    803
    Joined:
    Mar 4, 2015
    How many cores does it saturate?
     
  14. Nobu

    Nobu 2[H]4U

    Messages:
    2,292
    Joined:
    Jun 7, 2007
    Two, but it gets them really moist!
     
  15. Mazzspeed

    Mazzspeed [H]ard|Gawd

    Messages:
    1,206
    Joined:
    Dec 27, 2017
    You always have a master thread....
     
  16. cyclone3d

    cyclone3d [H]ardForum Junkie

    Messages:
    12,747
    Joined:
    Aug 16, 2004
    I know that. But I would think they could break it apart more than they do. Maybe breaking it apart more than they do has negative effects due to threads having to sync up.
     
  17. ManofGod

    ManofGod [H]ardForum Junkie

    Messages:
    9,580
    Joined:
    Oct 4, 2007
    No, not always, just for now. Otherwise, let's get real, things will not move forward at some point and that is a bad thing.
     
    DrBorg likes this.
  18. DrBorg

    DrBorg Gawd

    Messages:
    543
    Joined:
    Jan 22, 2005
    That depends on what you're doing.

    But, at least 6.

    Video card makes the most difference; 2x7970's will go wild trying to keep up. :)

    When the Vega's get cheap, I'll see if two of those helps. :D
     
  19. TType85

    TType85 [H]ard|Gawd

    Messages:
    1,426
    Joined:
    Jul 8, 2001
    I think it auto-configures now, but I don't think it does a good job. I have a gaming VM (8c/16t from a 1950X, GTX 1070) that I play WoW on. On Windows 10 it will use multiple cores, but 1 core is always at 90%+. On an OSX VM running on the same machine (VM with 8c/16t, GTX 1070) I see much more even core usage, 25-35% on each core. Frame-rate wise, I get higher lows in the new BFA capitals on the OSX VM but higher highs in the open world on the Windows one. Running at 1440p with settings at 7, I am seeing 30-45 in the Alliance town and 130+ in the open world on the Windows VM, and 35-50 and 100 max on the OSX VM (frame limiter off).
     
  20. cyclone3d

    cyclone3d [H]ardForum Junkie

    Messages:
    12,747
    Joined:
    Aug 16, 2004
    Well.. yeah, I hope it changes for games. Some tasks are just serial by nature.

    Some tasks, such as a prime number generator, are very easy to split out so the workload is pretty much exactly even over all the threads, without ANY syncing having to go on between threads. No thread locks, except for the one thread that only monitors the other threads to detect when they have all finished before it spits out the results.
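As a rough illustration of the kind of split described above (not cyclone3d's actual program, which isn't shown in the thread), here is a minimal Python sketch. The names `is_prime`, `count_odd_primes_strided`, and `count_primes_parallel` are made up for this example. Each worker takes an interleaved slice of the odd candidates, shares no state with the others, and the calling thread only waits for the results and adds them up. In standard CPython builds the GIL keeps these threads from running Python code truly in parallel, so this shows the partitioning rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def is_prime(n: int) -> bool:
    """Trial division; slow but enough to illustrate the split."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3
    if n % 2 == 0:
        return False
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def count_odd_primes_strided(limit: int, worker: int, workers: int) -> int:
    """Count primes among the odd numbers 3+2*worker, 3+2*(worker+workers), ... below limit."""
    start = 3 + 2 * worker
    step = 2 * workers
    return sum(1 for n in range(start, limit, step) if is_prime(n))


def count_primes_parallel(limit: int, workers: int = 4) -> int:
    """Count primes below limit (exclusive) using independent workers."""
    if limit <= 2:
        return 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Workers share nothing; the only coordination is waiting for results.
        futures = [
            pool.submit(count_odd_primes_strided, limit, k, workers)
            for k in range(workers)
        ]
        # The extra 1 accounts for the prime 2, which no worker tests.
        return 1 + sum(f.result() for f in futures)
```

Interleaving the odd candidates keeps small and large numbers spread across the workers, so each one gets a roughly similar amount of work.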
     
    drescherjm likes this.
  21. Elf_Boy

    Elf_Boy 2[H]4U

    Messages:
    2,176
    Joined:
    Nov 16, 2007
    How many threads does wow support now?

    Haven't played the new xpac yet.
     
  22. ManofGod

    ManofGod [H]ardForum Junkie

    Messages:
    9,580
    Joined:
    Oct 4, 2007
    Gaming itself is not serial in nature; it has been that way out of necessity. Most gaming machines had single-core CPUs for at least 20 years, and therefore the code base was built with that in mind.
     
  23. BSmith

    BSmith [H]ard|Gawd

    Messages:
    1,240
    Joined:
    Nov 9, 2017
    Games, for the most part, are a poor application for threading, simply due to how games work. They are event driven. Sure, you can create a hundred threads, but then if they are all waiting on a single event to start a cascade of events, then what is the point?

    The game loop already runs asynchronously in any decent game. Sound is already threaded at the OS driver level. Inputs are all threaded. You can thread artwork all you like, but there is only one bus to shove the data down, or one memory path to store the data so each thread would be locked waiting for others to finish. Those context switches add more overhead, by the way.

    Most of the things that need to be threaded are already threaded. Sure, you can have a couple more threads, but nothing like what a video editor/compiler could have.
     
    Sulphademus, drescherjm and Absalom like this.
  24. Sulphademus

    Sulphademus Limp Gawd

    Messages:
    289
    Joined:
    Mar 18, 2010
    So all these tests, 2 4 8 16 32 64, are all real cores and then for 128 they turn SMT on? Did I understand that correctly?
     
  25. mashie

    mashie Mawd Gawd

    Messages:
    4,165
    Joined:
    Oct 25, 2000
    Correct.
     
  26. BloodyIron

    BloodyIron 2[H]4U

    Messages:
    3,119
    Joined:
    Jul 11, 2005
    Not surprised here. The Linux kernel is literally written to handle thousands of cores due to the HPC/Super Computers it's used in. We just happen to be able to leverage that work because it's open sourced ;)
     
  27. Mazzspeed

    Mazzspeed [H]ard|Gawd

    Messages:
    1,206
    Joined:
    Dec 27, 2017

    Attached Files:

    Last edited: Oct 12, 2018
  28. Elf_Boy

    Elf_Boy 2[H]4U

    Messages:
    2,176
    Joined:
    Nov 16, 2007
    Morrowind is still 1 thread.

    Who is up for updating the engine?

    Wish Bethesda would remaster it.

    Will the new Elder Scrolls finally be fully threaded?

    I have wondered, but don't know enough about coding to think it out myself, whether the initial game load could be better managed or threaded. My experience, which matches things I have read on [H] and elsewhere, is that games on NVMe x4 storage barely load faster than on fast SATA-based storage. In game, after the initial load, I see much better performance with NVMe vs SATA 3 flash storage: in-game teleports, moving quickly to an area with a crap ton of textures to be loaded, etc.

    I don't think there was a reason to improve game load times or do anything different until NVMe.
     
  29. Nobu

    Nobu 2[H]4U

    Messages:
    2,292
    Joined:
    Jun 7, 2007
    http://openmw.org/
     
    Frobozz likes this.
  30. BSmith

    BSmith [H]ard|Gawd

    Messages:
    1,240
    Joined:
    Nov 9, 2017
    The problem with most initial game loads is that the developer got lazy and took the cheap way out by mallocing space for each resource instead of doing one malloc for all of it and then managing the resources themselves. That one thing can speed up the game load time by many factors.
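The "one allocation, manage it yourself" idea is usually called an arena or bump allocator. A minimal sketch follows, with some loud assumptions: Python has no malloc, so a single `bytearray` stands in for the one big allocation, and the `Arena` class, its methods, and the 16-byte default alignment are all hypothetical choices for illustration. How much load time this actually saves in a real game depends on the engine and is not something the sketch measures.

```python
class Arena:
    """One up-front buffer, handed out in slices by a bump pointer."""

    def __init__(self, capacity: int) -> None:
        self._buffer = bytearray(capacity)      # the single allocation
        self._view = memoryview(self._buffer)   # slices of this copy nothing
        self._offset = 0                        # next free byte

    @property
    def used(self) -> int:
        return self._offset

    def alloc(self, size: int, align: int = 16) -> memoryview:
        """Reserve size bytes, starting at the next multiple of align."""
        start = (self._offset + align - 1) // align * align
        end = start + size
        if end > len(self._buffer):
            raise MemoryError("arena exhausted")
        self._offset = end
        return self._view[start:end]

    def reset(self) -> None:
        """Release every resource at once, e.g. when a level unloads."""
        self._offset = 0
```

A loader would size the arena once per level, call `alloc` for each texture or mesh as it is read from disk, and call `reset` when the level is torn down instead of freeing resources one by one.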

    The only reason game performance would be better from an SSD vs a hard disk is that the coding is sloppy. Sloppy in terms of managing its resources. What they should be doing is running a thread which preloads what is going to be needed, rather than waiting to load it when they need it. Then the performance would not be tied to the storage system. Not hard to do, but it takes some time to do it right.
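The preloading thread described above could look something like this minimal Python sketch. Everything here is an assumption for illustration: the `Preloader` class, its `hint`/`get`/`wait_idle`/`close` methods, and the `load` callback that stands in for whatever actually reads a resource from disk. It only hides storage latency when the hint arrives early enough; a resource requested before the worker gets to it still falls back to a blocking load.

```python
import queue
import threading
from typing import Callable, Dict, Iterable


class Preloader:
    """Background thread that loads resources before the game asks for them."""

    def __init__(self, load: Callable[[str], bytes]) -> None:
        self._load = load                    # hypothetical slow I/O function
        self._requests = queue.Queue()       # resource names to preload
        self._cache: Dict[str, bytes] = {}
        self._lock = threading.Lock()        # guards the cache only
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        while True:
            name = self._requests.get()
            try:
                if name is None:             # sentinel: shut down
                    return
                with self._lock:
                    cached = name in self._cache
                if not cached:
                    data = self._load(name)  # slow read, off the game thread
                    with self._lock:
                        self._cache[name] = data
            finally:
                self._requests.task_done()

    def hint(self, names: Iterable[str]) -> None:
        """Tell the worker what is likely to be needed soon."""
        for name in names:
            self._requests.put(name)

    def get(self, name: str) -> bytes:
        """Return a resource, loading it synchronously on a cache miss."""
        with self._lock:
            data = self._cache.get(name)
        if data is None:
            data = self._load(name)
            with self._lock:
                self._cache[name] = data
        return data

    def wait_idle(self) -> None:
        """Block until every hinted resource has been processed."""
        self._requests.join()

    def close(self) -> None:
        self._requests.put(None)
        self._thread.join()
```

In a game, `hint` would be called as the player approaches a new area, so that by the time `get` runs the data is already in memory.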