CPU Utilization is Wrong

Discussion in 'HardForum Tech News' started by rgMekanic, Apr 30, 2018.

  1. rgMekanic

    rgMekanic [H]ard|News Staff Member

    Messages:
    3,725
    Joined:
    May 13, 2013
    In May of last year, Brendan Gregg, senior performance architect at Netflix, posted an interesting article about how the "%CPU" metric is wrong and is progressively getting worse. Now Brendan expands on his findings in a 5-minute video from the Southern California Linux Expo. The UpSCALE Lightning Talk from Opensource.com goes over his original idea very well, but also shows an interesting conclusion.

    Check out the video

    In his Lightning Talk, "CPU Utilization is Wrong," Brendan explains what CPU utilization means—and doesn't mean—about performance and shares the open source tools he uses to identify reasons for bottlenecks and tune Netflix's systems. He also includes a mysterious case study that's relevant to everyone in 2018.
     
  2. naib

    naib [H]ard|Gawd

    Messages:
    1,253
    Joined:
    Jul 26, 2013
    I thought everyone knew this?
    If my Linux load is high but my temperature isn't increasing, I am I/O-blocked.
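A rough illustration of why that heuristic works: on Linux, the load average counts tasks in uninterruptible sleep (state `D`, usually waiting on I/O) as well as runnable tasks (state `R`). The sketch below just tallies the two groups. The state list is invented for the example, and `load_contributors` is a hypothetical helper, not part of any tool.

```python
from collections import Counter


def load_contributors(states):
    """Count the task states that feed the Linux load average.

    `states` is an iterable of one-letter process state codes,
    e.g. "R" (runnable), "D" (uninterruptible sleep), "S" (sleeping).
    """
    counts = Counter(states)
    return {"runnable": counts["R"], "uninterruptible": counts["D"]}


# Hypothetical snapshot: one task on the CPU, three stuck waiting on I/O.
snapshot = ["R", "D", "D", "S", "S", "D"]
result = load_contributors(snapshot)
print(result)
```

If most of the load comes from the `uninterruptible` bucket, the cores are mostly waiting rather than computing, which would fit a high load with flat temperatures.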
     
  3. cyclone3d

    cyclone3d [H]ardForum Junkie

    Messages:
    12,893
    Joined:
    Aug 16, 2004
    Sooo.. the whole point of the video was to show that the Meltdown and Spectre patches are causing a 26% slowdown because they flush the TLB, thus causing a massive number of TLB misses.

    But instead of just getting to the point, the speaker just said that CPU Utilization is incorrect...... :rolleyes:
     
  4. Shmee

    Shmee [H]ard|Gawd

    Messages:
    1,148
    Joined:
    Sep 12, 2014
    But it is incorrect.
     
  5. dgingeri

    dgingeri 2[H]4U

    Messages:
    2,830
    Joined:
    Dec 5, 2004
    Pretty straightforward. Cache misses inflate what the metric reports as CPU utilization, even though the CPU is stalled waiting rather than doing work. Good point. The case he uses is specific to Intel CPUs and the Meltdown patches, but it applies to CPUs in general waiting on main memory reads, which happens all the time.
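One way to make that concrete is instructions per cycle (IPC): two workloads can both show 100% utilization while retiring very different amounts of work per cycle. A minimal sketch, assuming you already have cycle and instruction counts from hardware counters. The counter values are made up, and the 1.0 threshold is only a rough rule of thumb assumed here, not something the metric defines.

```python
def ipc(instructions: int, cycles: int) -> float:
    """Instructions retired per CPU cycle."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions / cycles


def describe(ipc_value: float, threshold: float = 1.0) -> str:
    """Very rough classification; the threshold is an assumed rule of thumb."""
    if ipc_value < threshold:
        return "likely stalled on memory"
    return "likely instruction-bound"


# Hypothetical counter readings for two workloads that both report 100% CPU.
stalled = ipc(instructions=500, cycles=1000)
busy = ipc(instructions=2000, cycles=1000)
print(stalled, describe(stalled))
print(busy, describe(busy))
```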
     
  6. Ski

    Ski Gawd

    Messages:
    946
    Joined:
    Jun 21, 2008
    4 in 10 Americans think the earth is less than 10,000 years old so when you apply that same level of tech ignorance then it becomes abundantly clear that not everyone knew this.
     
  7. SvenBent

    SvenBent 2[H]4U

    Messages:
    2,791
    Joined:
    Sep 13, 2008
    It's completely correct. You just have to understand what it is you are measuring and stop confusing it with something you believe it is.
     
  8. Maveric79111

    Maveric79111 n00b

    Messages:
    3
    Joined:
    Jun 6, 2017
    I've been saying this for almost 10 years... Why did I not present on this so many years ago? This isn't smart stuff to me, it's duh stuff! This occurs with memory usage too, and I should have presented and written a white paper on this.
     
  9. tissimo

    tissimo Limp Gawd

    Messages:
    318
    Joined:
    Dec 8, 2007
    Yup, that's why 100% game loads produce different temps than, say, Prime or IBT 100% loads.
     
  10. bigdogchris

    bigdogchris [H]ard as it Gets

    Messages:
    17,787
    Joined:
    Feb 19, 2008
    And why you can pull up an AIDA benchmark and still browse the web like nothing is running even though it says 100%.
     
  11. ecuador

    ecuador Limp Gawd

    Messages:
    206
    Joined:
    Dec 29, 2008
    I guess Netflix could switch to AMD CPUs? :) Just sayin'....
     
  12. Outsideloop Computers

    Outsideloop Computers n00b

    Messages:
    4
    Joined:
    May 19, 2013
    Yeah, new title for the presentation: "Why Netflix is going Epyc"
     
  13. bitbum

    bitbum Gawd

    Messages:
    514
    Joined:
    Mar 10, 2003
    If you follow Brendan's work you'll know that he's been doing this kind of analysis for years. Usually it has to do with spending inordinate cycles executing a particular code block. Inefficient code produces results similar to those of an inefficient processor. Not so surprising.
     
  14. TordanGow

    TordanGow Gawd

    Messages:
    834
    Joined:
    May 25, 2015
    For one thing anyone that is managing large scale server deployments knows about this. Second, if only we had a way to measure external delays like I/O wait... oh what's that, we already do?

    I'm a dope and even I know this.
     
  15. craigdt

    craigdt Gawd

    Messages:
    1,015
    Joined:
    Oct 27, 2016
    Seems like it's been about that long since we had a meaningful chip advancement from Intel.



    I'll see myself out.
     
  16. ecktt

    ecktt Limp Gawd

    Messages:
    413
    Joined:
    Oct 22, 2004
    Wow. He wrote all those tools and still screwed the pooch on a process waiting on input causing erroneous CPU utilization. Yup, it's a Linux piece. No wonder. Yeah, these guys need to go back and read "Operating Systems" by the legendary Andrew S. Tanenbaum. Instead of saying that context switches get more expensive, he makes an absolutely wrong statement about misleading CPU utilization.
     
  17. velusip

    velusip [H]ard|Gawd

    Messages:
    1,577
    Joined:
    Jan 24, 2005
    Well, the CPU utilization metric is correct if you plan on using it to throttle similarly bottlenecked code.
     
  18. the_real_7

    the_real_7 [H]ard|Gawd

    Messages:
    1,183
    Joined:
    Sep 10, 2007
    I skipped the Meltdown and Spectre patches and BIOS updates. I haven't seen any incident yet on my clients' PCs or any of mine. These patches do more harm than good; you can see that in benches.
     
  19. naib

    naib [H]ard|Gawd

    Messages:
    1,253
    Joined:
    Jul 26, 2013
    Well there is that :)
    I guess what I meant was every Linux user knows (or should know) this. These are Linux tools being demonstrated, and any sysadmin needs to know how to track down I/O-bound tasks.


    Windows is different as MS doesn't make this obvious.

    Fundamentally this isn't CPU loading, this is task scheduler loading. A CPU is always working; parts of it may be unclocked to save power, BUT it is still used.
    An OS scheduler, however, is different.
     
  20. 1_rick

    1_rick Limp Gawd

    Messages:
    384
    Joined:
    Feb 7, 2017
    Sure, but his presentation is like an ambush episode of Jerry Springer, and then instead of any kind of conclusion, he just does a mic drop.
     
  21. katanaD

    katanaD [H]ard|Gawd

    Messages:
    1,987
    Joined:
    Nov 15, 2016

    I'm pretty sure it's closer to 2 in 5 Americans...
     
  22. gtrguy

    gtrguy Limp Gawd

    Messages:
    145
    Joined:
    Oct 8, 2009
    No man, it's definitely 40%... :p
     
  23. PaulP

    PaulP Gawd

    Messages:
    776
    Joined:
    Oct 31, 2016
    Have a source for that stat, or did you just make it up?
     
  24. Sulphademus

    Sulphademus Limp Gawd

    Messages:
    314
    Joined:
    Mar 18, 2010
    Why is this "idle" process using up so much of my CPU time??!
     
  25. John721

    John721 [H]ard|Gawd

    Messages:
    1,622
    Joined:
    Mar 8, 2006
    http://bfy.tw/HwPo

    Sadly, it seems to be the case.
     
  26. PaulP

    PaulP Gawd

    Messages:
    776
    Joined:
    Oct 31, 2016
    All I could find was a bullshit Gallup poll that gave people only three choices: evolution with God's help, evolution without God's help, and creation 10,000 years ago. That leaves out a lot of people who believe in creationism but have other ideas on the timeline, including the idea that the Earth is millions of years (or more) old. These people will not select the first two answers, so they get lumped in with the "young earth" creationists. So really all that poll proves is that 40% of the people in this country believe in creationism. I'll bet that percentage is much higher in Muslim countries. Does that make them backwards and stupid too?
     
  27. Dan_D

    Dan_D [H]ardOCP Motherboard Editor

    Messages:
    53,241
    Joined:
    Feb 9, 2002
    Again, this backs up any statement that more or less says: "Most people are idiots."
     
  28. Ski

    Ski Gawd

    Messages:
    946
    Joined:
    Jun 21, 2008
    giphy.gif
     
  29. xorbe

    xorbe [H]ardness Supreme

    Messages:
    5,975
    Joined:
    Sep 26, 2008
    100% utilization doesn't mean 100% max load. It just means the kernel scheduler had something to run on that CPU thread other than the idle task. It's always been a PITA to quantify CPU and memory usage; everyone wants to know something slightly different.
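To show what "something other than idle" means in numbers, here is a minimal sketch of how a utilization figure can be derived from two readings of the aggregate `cpu` line in Linux's `/proc/stat`. The two sample lines are invented, and `parse_cpu_line` and `utilization` are hypothetical helper names. Treating `iowait` as not busy is a choice made for this sketch.

```python
# First eight counters on the "cpu" line of /proc/stat, in order.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn an aggregate 'cpu ...' line into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return dict(zip(FIELDS, (int(p) for p in parts[1:1 + len(FIELDS)])))


def utilization(before, after):
    """Fraction of elapsed ticks spent busy, and spent idle waiting on I/O."""
    delta = {name: after[name] - before[name] for name in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    not_busy = delta["idle"] + delta["iowait"]
    return {
        "busy": (total - not_busy) / total,
        "iowait": delta["iowait"] / total,
    }


# Invented samples; a real tool would read /proc/stat twice, some time apart.
first = parse_cpu_line("cpu  1000 0 500 8000 500 0 0 0 0 0")
second = parse_cpu_line("cpu  1600 0 700 8100 600 0 0 0 0 0")
report = utilization(first, second)
print(report)
```

Note that "busy" here only says the scheduler had a non-idle task on the CPU during those ticks. It says nothing about whether the core was retiring instructions or stalled on memory.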
     
  30. ecuador

    ecuador Limp Gawd

    Messages:
    206
    Joined:
    Dec 29, 2008
    No, you did not have to choose only one of those three; that's why the combined percentage is not 100%. The category you are describing would obviously choose "none of the above", which seems to be about 5%.