AMD’s newest Bulldozer architecture – FX-8120 8Cores performance and OC 5G

Yeah, it's not as easy as just having the scheduler recognize them in a chronological order. A lot of thought has to go into a Win7 (or Win8) scheduler for Bulldozer. For AMD and Microsoft to release a BD-optimized scheduler, they have to take quite a lot into account.

Off the top of my head:

Current and future clock speed gains, and how the turbo will impact the scheduler
Placing like threads within the same module to optimize performance
Deciding where to send the threads when there are fewer of them than cores
On top of the above, you have to account for the shared FP unit (one 256-bit or two separate 128-bit), the L2 shared within each module, and the shared L3.

It's gonna be difficult and I'm not surprised I don't see one yet. I'd imagine the chronological order one would have been relatively easy to come up with, as would a "module first until over 4" scheduler, but those have obvious downsides. Making one that accounts for those weaknesses and actually helps the BD (and CMT) architecture won't be easy.
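To make the two simple policies concrete, here is a rough Python sketch. It is only a toy model, not how Windows actually schedules: it assumes, as the core numbering used in this thread suggests, that logical cores 2m and 2m+1 share module m, and the function names are made up for illustration.

```python
# Toy model of the two naive placement policies.
# Assumption: logical cores 2m and 2m+1 sit in the same module m.
N_CORES = 8


def chronological(n_threads):
    """Fill cores in numeric order: 0, 1, 2, ..."""
    return list(range(min(n_threads, N_CORES)))


def module_first(n_threads):
    """One core per module first (0, 2, 4, 6), then the siblings (1, 3, 5, 7)."""
    order = list(range(0, N_CORES, 2)) + list(range(1, N_CORES, 2))
    return order[:min(n_threads, N_CORES)]


def shared_modules(cores):
    """Count modules that have both of their cores busy."""
    modules = [c // 2 for c in cores]
    return sum(1 for m in set(modules) if modules.count(m) == 2)


if __name__ == "__main__":
    for n in range(1, N_CORES + 1):
        chrono, modfirst = chronological(n), module_first(n)
        print(n, chrono, shared_modules(chrono), modfirst, shared_modules(modfirst))
```

With 4 threads the chronological policy already doubles up two modules while module-first doubles up none; from 5 threads on, module-first has to start sharing as well, which is where the simple policy stops being an obvious win.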

EDIT: you can set thread (well, core) affinity on your own anyway, but that requires knowing how many threads the program uses, and that isn't as easy as it seems. Then there's the hassle of always playing with it depending on how many threads your CPU is being asked to handle.
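For anyone who wants to script the affinity part, a minimal sketch in Python. The helper names are hypothetical, and the pinning call only does anything where `os.sched_setaffinity` exists (Linux); on Windows you would use Task Manager or `start /affinity` with the same hex mask instead.

```python
import os


def affinity_mask(cores):
    """Build the bitmask for a list of core numbers (bit n set = core n allowed)."""
    mask = 0
    for c in cores:
        mask |= 1 << c
    return mask


def pin_current_process(cores):
    """Pin this process to the given cores if the OS call is available.

    Returns True if the affinity was applied, False if the platform
    has no os.sched_setaffinity (for example Windows).
    """
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, set(cores))
        return True
    return False


if __name__ == "__main__":
    # One core per module, assuming cores 2m and 2m+1 share a module.
    print(format(affinity_mask([0, 2, 4, 6]), "02x"))  # prints 55
```

It still doesn't solve the real problem, which is knowing how many threads the program is going to spawn in the first place.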

Can you turn off specific cores in BD? Or are you limited to turning off modules only?

I haven't actually worked with a Bulldozer or Zambezi CPU yet. However, since it uses the same motherboards as the previous Phenom IIs did, I'd say that disabling individual cores should be easy to do and supported on almost all, if not all, BD/Zambezi-compatible boards.
 
Would be interesting to see, particularly from a benchmark and gaming perspective, especially if you know the task at hand uses 4 threads or fewer but more than 1. Would also be quite funny as well... Congratulations on your new gimped but better performing 4 core processor :D

It'd be cores 0, 2, 4, 6 or 1, 3, 5, 7
 

No kidding, you have to cripple it to fix the performance...now you have a standard 4 core cpu...ummm, yeah go AMD! :rolleyes:
 
The one thing I think that would answer is just how much the resource sharing inside of modules is hindering performance. If it'd be a better performer, I'd rather just see BD gimped to 4 cores instead of 8. IMO core count isn't really as relevant as overall performance.

Maybe it would mess with the turbo function, but can't that be disabled anyway and we just set our own speed? From an enthusiast perspective, an 8120 might not be as bad as it currently seems if you could make it closer to a 2500k by disabling 4 "cores", disabling the turbo, and setting your own OC speed around 4.6ghz.
 
You'd have to hit 5.5ghz to match the per-core performance of a 2500k at stock clocks.

picc-fourier.gif




myrimatch.gif


0f is the affinity mask that puts the four threads on cores 0-3, so both cores are utilized in each of two modules, and 55 puts them on cores 0, 2, 4 and 6, one core in each of 4 separate modules. So clearly there's a resource-sharing tax that you have to take when tasking the threads within coupled cores rather than separate modules, and it's quite hefty. That hefty gain when not coupled, though, doesn't make up for the discrepancy you see between the 2500k/2600k and the 8150s in terms of overall performance. The turbo core is actually one thing that functions very well in Bulldozer and works far better than what Intel currently has. Opting to decrease its significance doesn't seem like a smart move going forward.
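If the hex masks look cryptic, this small sketch decodes them. It assumes the same pairing as above (cores 2m and 2m+1 in module m), and `decode_mask` is just an illustrative helper, not anything from the benchmark tools.

```python
def decode_mask(mask, n_cores=8):
    """Return (cores, per_module) for an affinity bitmask.

    cores      -- list of core numbers whose bit is set
    per_module -- dict mapping module number to the cores used in it,
                  assuming cores 2m and 2m+1 belong to module m
    """
    cores = [c for c in range(n_cores) if (mask >> c) & 1]
    per_module = {}
    for c in cores:
        per_module.setdefault(c // 2, []).append(c)
    return cores, per_module


if __name__ == "__main__":
    for mask in (0x0F, 0x55):
        cores, per_module = decode_mask(mask)
        print(format(mask, "02x"), cores, per_module)
```

Mask 0f comes out as two fully loaded modules and mask 55 as four modules with one busy core each, which is the within-module vs between-module comparison in the graphs.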

You can't manually cripple the design in order to get more out of it, because that isn't practical and it simply won't pay off. A new Windows scheduler will help, but don't expect any miracles. I was simply stating a hypothetical scenario for benchmarking purposes, like the one I just posted =P
 
Haha I'm keeping my expectations well in check though.

1) If one still needs to use Mask 55 for max performance or not.
2) If the performance continues to be improved with Mask 55 - or if it is no longer optimal to do so.

Well it's only been like a day, so realistically it may be a couple days or even a week before someone posts results.
 
Hope this is good news...not a BD fan but maybe it's a small step in the right direction.
 

I don't think the "Mask 55" thread affinity will be the optimal setting for the Bulldozer scheduler, despite giving the best results in the benchmarks from TR. Filling 4 modules first before tasking them with another thread is great for up to 4 threads, but you can hit a roadblock depending on the tasks that follow those 4. It's likely to be a bit more complex than that, with a mixture of within-module and between-module placement (0f and 55 respectively).

I'd imagine the tasks between 2 and 4 threads will benefit the most, but overall gains shouldn't be overwhelming and certainly not enough to excuse the pitfalls of the architecture.
 
hey redshirt #24 infos on the main page and i gave u the cred u deserved for posting it first! :p
 