Most CPU utilization meters for Core i7 are flawed... (solution inside)

gtg465x · Mar 2, 2009

So I was sitting here today and it suddenly dawned on me that most of the CPU utilization meters I've seen for i7 are flawed, and the ones that do get it right don't have pretty bar charts. Most of them show utilization for 8 separate threads (which they mistakenly call cores), which is fine if you want a thread meter, but what if you want to see the utilization of each core as a whole as well as thread utilization?

Does anyone else see the problem here or is it just me?

If you agree with me, you can try this Vista sidebar gadget, a modification of mCPU Meter that I created today.

The two threads on each core are shown in different colors.

Download

edit: Warning: we have concluded that the gadget I created is not technically accurate.

TehQuick · Mar 2, 2009

lol, how do you know that 2 virtual threads = one physical core? AFAIK, HT means that the CPU functions like 8 virtual cores, all equal, not like 4 cores + 4 lesser HT threads.

gtg465x · Mar 2, 2009

TehQuick said:
lol, how do you know that 2 virtual threads = one physical core?

http://en.wikipedia.org/wiki/Hyper-threading

TehQuick said:
AFAIK, HT means that the CPU functions like 8 virtual cores, all equal, not like 4 cores + 4 lesser HT threads.

I know that all threads are equal, and that's how my gadget is programmed. If a thread (logical core) is at 100%, it will use exactly 50% of that physical core. For example, logical cores 1 and 5 correspond to physical core 1. In the above picture, logical core 1 is at 100% utilization and logical core 5 is at 40% utilization, meaning physical core 1 is at 70% utilization.
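
A minimal sketch of the arithmetic described in this post, assuming each logical core accounts for exactly half of its physical core. The function name is hypothetical and this is not the gadget's actual code; it only illustrates the equal-split model.

```python
def physical_core_utilization(thread_a: float, thread_b: float) -> float:
    """Estimate physical core load (percent) from its two logical cores.

    Assumption: each logical core contributes exactly half of the
    physical core, so the estimate is a plain average.
    """
    for load in (thread_a, thread_b):
        if not 0.0 <= load <= 100.0:
            raise ValueError("load must be between 0 and 100 percent")
    return (thread_a + thread_b) / 2.0


# Example from the post: logical core 1 at 100%, logical core 5 at 40%.
print(physical_core_utilization(100, 40))  # 70.0
```

Under this model a single maxed thread with an idle sibling always reads as 50% for the physical core.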

TehQuick · Mar 2, 2009

Well, again how do you know which threads reside on which core?

gtg465x · Mar 2, 2009

TehQuick said:
Well, again how do you know which threads reside on which core?

It was quite simple really. I used CoreTemp to monitor the utilization of the physical cores and ran an instance of Prime95 on one thread at a time.

Logical 0 and 4 correspond to physical 0.
Logical 1 and 5 correspond to physical 1.
Logical 2 and 6 correspond to physical 2.
Logical 3 and 7 correspond to physical 3.
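
The pairing listed above can be sketched as a small lookup. This assumes a 4-core, 8-thread chip enumerated the way this test showed (logical `i` and `i + 4` share physical `i`); other systems may number logical cores differently, and the function names are made up for illustration.

```python
PHYSICAL_CORES = 4  # assumption: quad-core i7 with Hyper-Threading enabled


def physical_core_of(logical: int, physical_cores: int = PHYSICAL_CORES) -> int:
    """Return the physical core that a logical core index maps to."""
    if not 0 <= logical < 2 * physical_cores:
        raise ValueError("logical core index out of range")
    return logical % physical_cores


def sibling_threads(physical: int, physical_cores: int = PHYSICAL_CORES) -> tuple:
    """Return the two logical core indices that share one physical core."""
    if not 0 <= physical < physical_cores:
        raise ValueError("physical core index out of range")
    return (physical, physical + physical_cores)


for core in range(PHYSICAL_CORES):
    a, b = sibling_threads(core)
    print(f"Logical {a} and {b} correspond to physical {core}.")
```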

NoNRG · Mar 2, 2009

It sounds like you know your stuff gtg. I do not know as much about the Comp Sci world as I would like (I'm an EE), nor do I have a Core i7 build to try this on atm (built one for my dad just recently though), but it looks like you got it right.

Is that a screenshot of the CPU meter running on your comp....? How many virtual machines are you running man!

Nenu · Mar 2, 2009

gtg465x said:
I know that all threads are equal, ...easured only by how long it accesses the CPU?

gtg465x · Mar 2, 2009

Nenu said:
I'm not so sure it can be viewed in this way, see what you think.

A single virtual thread can 100% occupy a CPU.
2 virtual threads can 100% occupy a CPU, but how can you tell what the actual CPU load of each thread is?
The loading of the virtual cores may not represent CPU load.

i.e.
Virtual thread 1 may be occupied 100% of the time but using 30% CPU.
Virtual thread 2 may be occupied 55% of the time and using 70% CPU.

That is a bit extreme, but I wonder if the virtual thread load and CPU load are always directly proportional or if the difference can only ever be minor/negligible.
Is virtual thread load measured only by how long it accesses the CPU?

I don't think so. If a thread is at 100%, I believe whichever physical CPU the thread is running on will be at 50%. If you can prove me wrong, by all means show me a screenshot of CoreTemp (or some other program that measures physical core utilization) showing 100% on a core when only one thread is at 100%. When I pushed a single thread to 100% with Prime95, CoreTemp showed the corresponding physical core at 50%.

gtg465x · Mar 2, 2009

More sample data. If you don't believe me, I suggest installing the gadget and running some tests for yourself before trying to prove me wrong. I could very well be wrong, but I want some hard numbers/screenshots to prove it, not just speculation.

Nenu · Mar 2, 2009

I don't have an i7 rig.
Still really happy on an E8400, my day will come though

It's really good you are doing this, I am very interested to see how it pans out.

edit:
It's interesting that it only shows 50% CPU use when one virtual thread is maxed.
It looks like CPU usage may be measured as both threads' occupancy added together.
This doesn't make sense, as one thread can occupy the CPU entirely, or can it??
Still, like you I'm open to learning more.

gtg465x · Mar 2, 2009

Nenu said:
I don't have an i7 rig.
Still really happy on an E8400, my day will come though
It's really good you are doing this, I am very interested to see how it pans out.

Don't get me wrong, I'm no genius, and I could very well be wrong about something, but all the testing I've done has supported my theory.

gtg465x · Mar 2, 2009

After some reading, here's my grossly oversimplified understanding of it...

Each logical core is only allocated 50% of a physical core, but a single threaded application can still run at full speed because the instructions will be divvied out to 2 (or more) logical cores. This can be observed by opening a single threaded application like SuperPi or a single instance of Prime95. They are split across multiple logical cores.

Honestly, some of these documents I've been reading are confusing as hell.

Nenu · Mar 2, 2009

I see, so it probably looks at pipeline occupation on the CPU itself.
Your app should correctly report CPU use, nice one!

Chilly · Mar 2, 2009

You're right about the way the information is displayed BUT you're wrong about how virtual threads are used. If a virtual thread is pegged at 100%, it is COMPLETELY possible for the WHOLE CORE to be in use, even if the other virtual thread tied to that CPU is at 0% utilization.

What Hyper-Threading does is use missed opportunities in the pipeline to do work; because of this, it's possible to fully max out a CORE even if the other virtual thread is not in use. HT takes advantage of the fact that the way software executes isn't perfect, and slots in extra work during downtime.

As for the "single threaded app being split among more than one thread", that's due to the way the OS manages the application (trying to optimize its performance) and has NOTHING to do with the APP or CPU.

You're on the right track; you just need to understand that a fully used thread with the other idle does not mean only 50% of the CPU is being utilized, but closer to 70-80%. It's not going to be easy (if it's even possible) to actually show the PROPER utilization of each core with *CURRENT* monitoring tools.

gtg465x · Mar 2, 2009

Chilly said:
You're right about the way the information is displayed BUT you're wrong about how virtual threads are used. If a virtual thread is pegged at 100%, it is COMPLETELY possible for the WHOLE CORE to be in use, even if the other virtual thread tied to that CPU is at 0% utilization.

What Hyper-Threading does is use missed opportunities in the pipeline to do work; because of this, it's possible to fully max out a CORE even if the other virtual thread is not in use. HT takes advantage of the fact that the way software executes isn't perfect, and slots in extra work during downtime.

As for the "single threaded app being split among more than one thread", that's due to the way the OS manages the application (trying to optimize its performance) and has NOTHING to do with the APP or CPU.

You're on the right track; you just need to understand that a fully used thread with the other idle does not mean only 50% of the CPU is being utilized, but closer to 70-80%. It's not going to be easy (if it's even possible) to actually show the PROPER utilization of each core with *CURRENT* monitoring tools.

Can you explain the picture in post #8?

gtg465x · Mar 2, 2009

You're right Chilly and Nenu. I've confirmed your explanation with further testing. I was able to calculate that, when forced to run on a single thread, SuperPi used 69% of the processing power of that physical core. The load information displayed by CoreTemp is simply not technically accurate.

So my original theory that CPU utilization meters for Core i7 are flawed still stands, but my solution does not. For now, we'll just have to watch thread utilization without actually knowing how much of each core is being utilized.

Nenu · Mar 2, 2009

gtg465x · Mar 2, 2009

Anyone think this is a good way to display it? I'm basically just grouping the two logical core displays together on their respective physical core.

InvisiBill · Mar 2, 2009

gtg465x said:
Anyone think this is a good way to display it? I'm basically just grouping the two logical core displays together on their respective physical core.

I think that looks pretty good. Since we don't currently have a perfectly accurate way to represent this, that view at least gives you an idea what's going on per thread and per core.

My solution is just to run dnetc so that I'm always at 100%. =)

Chilly · Mar 3, 2009

gtg465x said:
Anyone think this is a good way to display it? I'm basically just grouping the two logical core displays together on their respective physical core.

This looks like a great compromise!

GenTarkin · Mar 4, 2009

Yeah, it's interesting though, because the updated graph that you posted above still does not really show what percentage of a physical core is being used.
Let's take core 0, for example.
That graph could be showing that the physical core is being used 100%, or it could be used only about 70% (judging by the bars alone).

The reason is the nature of HT: your graph still presents it as though there were a 50/50 physical partition of the one physical core.
As pointed out earlier in the thread, HT is dynamic. If a thread requests 100% of the core, it will be given nearly 100% of the core. If another thread with the same priority comes along and also wants 100%, HT will drop both threads to around 50/50 of physical core usage.
But if a thread uses 100% of a "logical CPU" and another thread of lesser priority comes in and wants to run, HT will put that thread at, say, 25% usage, and the first thread may go to 75% actual usage while still showing 100% usage in Task Manager or your graph.
So core 0 (the physical core) in your picture could actually be 100% utilized even though the picture shows only 90% for thread 1 and 35% for thread 2 (which appears to be roughly only 70% physical core usage), because in this case HT may actually be working in such a way that the "partition" of the physical core is something in the ballpark of 80% (thread 1) and 20% (thread 2).

It's hard for me to explain this so that it makes complete sense, but hopefully I'm making some sort of sense =).

Evidence can be seen if one runs LinX and tells one instance of it to run only on cores 0-3, and then only on cores 0, 1, 4, 5.
In the first case, cores 0-3, LinX times will be half of what they are on cores 0, 1, 4, 5. That's simply because on cores 0-3 you are running all 4 threads on separate physical cores at 100%. If HT really cut each core in half, then the times on cores 0-3 would not be half of the times on cores 0, 1, 4, 5, because on 0, 1, 4, 5 you are making HT dedicate half of each of the first 2 physical cores to each thread.
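
For anyone who wants to repeat that comparison, here is a rough sketch of how the two core selections translate into affinity bitmasks with one bit per logical processor, which is the form Windows affinity settings take. The helper name is hypothetical, and how the mask is actually applied to LinX is left out.

```python
def affinity_mask(logical_cpus) -> int:
    """Build a bitmask with one bit set per selected logical CPU index."""
    mask = 0
    for cpu in logical_cpus:
        if cpu < 0:
            raise ValueError("logical CPU index must be non-negative")
        mask |= 1 << cpu
    return mask


# Four threads on four separate physical cores (logical 0-3).
print(hex(affinity_mask([0, 1, 2, 3])))  # 0xf

# Four threads sharing two physical cores (logical 0, 1, 4, 5).
print(hex(affinity_mask([0, 1, 4, 5])))  # 0x33
```

With the logical-to-physical pairing found earlier in the thread, the first mask lands on four different physical cores, while the second puts two threads on each of physical cores 0 and 1.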

So really, there is no way to make a program that monitors the actual usage of each physical core if it relies on monitoring the CPU the way Windows Task Manager does. It would have to be something that does low-level hardware monitoring on the CPU.

ilkhan · Mar 4, 2009

As far as I know, the execution units (the actual "core") are dynamically allocated: if one logical core is maxed and the other is idling, the execution units are all in use.

GenTarkin · Mar 4, 2009

yeah that is what I was trying to say above =)
