AMD Bulldozer / FX-8150 Gameplay Performance Review @ HardOCP

[Attached Eyefinity benchmark images: dirt-3-eyefinity.png, deus-ex-eyefinity.png]
Who said the FX-8150 will be better with surround? :rolleyes:
 
Strange request, but when you guys do the second round of gaming benchmarks, is there any way to disable cores so you can run it at 1 core per module for a test? I'm just curious how that'd affect gaming performance, given basically no games use 8 cores properly and people are reporting (http://www.overclock.net/15288816-post1604.html) that the module sharing penalty is pretty big and worth avoiding if you're not actively using more than 4 cores.

It's not all that practical, but my nerdness is curious what actually happens. I have my doubts dropping 4 cores will really help at all, but I'm intrigued anyway.
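
If there's no BIOS switch for it, here's a rough sketch of the idea using the Win32 affinity call, restricting a process to one core per module. This assumes logical CPUs 0/1, 2/3, 4/5 and 6/7 are the module pairs, which I haven't verified, so check that first:

```c
/* Minimal sketch: restrict this process to one core per module on an
 * 8-core Bulldozer, assuming logical CPUs 0/1, 2/3, 4/5, 6/7 pair up
 * into modules (an assumption; verify the pairing on your own box).
 * Mask 0x55 = binary 01010101 = CPUs 0, 2, 4, 6. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    DWORD_PTR mask = 0x55; /* one core from each of the four modules */

    if (!SetProcessAffinityMask(GetCurrentProcess(), mask)) {
        fprintf(stderr, "SetProcessAffinityMask failed: %lu\n", GetLastError());
        return 1;
    }
    printf("Process now limited to CPUs 0, 2, 4, 6.\n");
    /* ...launch the benchmark/game from here, or apply the same mask to
     * the game's process handle obtained via OpenProcess(). */
    return 0;
}
```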
 
I wonder if scheduler issues are causing the poor single-threaded results. If the Win7 scheduler is throwing OS threads and the app's single thread at the same module (but different cores), the FPU gets shared instead of working solely on the app's thread, basically cutting FPU performance in half.

Could be why the single-threaded performance looks poor in some instances, and why the Win8 results on AnandTech showed some decent improvements.

It might be worth doing some FPU-intensive single-thread tests in Win7 vs. Win8 to see what happens with the scheduler improvements.
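
Even something as crude as this would do for a first pass: a dependent floating-point loop timed with QueryPerformanceCounter (just a sketch, nothing rigorous), run under Win7 and Win8, pinned vs. unpinned:

```c
/* Rough single-threaded FPU micro-test to compare under Win7 vs Win8;
 * the numbers are only meaningful relative to each other on the same chip. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    LARGE_INTEGER freq, t0, t1;
    double x = 1.000000001, sum = 0.0;
    long long i;

    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&t0);

    for (i = 0; i < 200000000LL; i++) {   /* dependent FP multiply/add chain */
        x = x * 1.0000001 + 1e-9;
        sum += x;
    }

    QueryPerformanceCounter(&t1);
    printf("sum=%f  elapsed=%.3f s\n", sum,
           (double)(t1.QuadPart - t0.QuadPart) / (double)freq.QuadPart);
    return 0;
}
```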

Either way, it's no big step forward, but it is definitely an improvement over the Phenom II X6 in multi-threaded stuff, which is good. If I were building a new system now it'd be a tough call, as a BD system will be cheaper overall but will only get one upgrade before a new socket (likely the same as Intel at this point). I'm still GPU limited, so I won't bother, but I'll be interested in what happens a year or so down the road. This looks like a good base for AMD if they can crank up the speeds (which it looks like they've been able to do to a degree), and if they can improve the IPC for Piledriver as they're claiming, that could be the hot ticket.
 
Strange request, but when you guys do the second round of gaming benchmarks, is there any way to disable cores so you can run it at 1 core per module for a test? I'm just curious how that'd affect gaming performance, given basically no games use 8 cores properly and people are reporting (http://www.overclock.net/15288816-post1604.html) that the module sharing penalty is pretty big and worth avoiding if you're not actively using more than 4 cores.

It's not all that practical, but my nerdness is curious what actually happens. I have my doubts dropping 4 cores will really help at all, but I'm intrigued anyway.

Do we know if there's a way to tell which cores are on the same module? I.e., is it 1/2 in module 1 and 3/4 in module 2? Or could it be something like 1/3 in module 1, 2/4 in module 2, etc.?
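
One possible way to check, assuming Windows exposes the module's shared L2 as a shared cache (which I haven't confirmed on Bulldozer): enumerate it with GetLogicalProcessorInformation and see which logical CPUs sit behind the same L2:

```c
/* Sketch: infer module pairs by looking for logical CPUs that share an
 * L2 cache, since each Bulldozer module has one L2 shared by its two
 * cores (assumption: Windows reports that L2 as a shared cache). */
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    DWORD len = 0;
    SYSTEM_LOGICAL_PROCESSOR_INFORMATION *info;
    DWORD i, n;

    GetLogicalProcessorInformation(NULL, &len);      /* query required size */
    info = malloc(len);
    if (!info || !GetLogicalProcessorInformation(info, &len)) {
        fprintf(stderr, "GetLogicalProcessorInformation failed\n");
        return 1;
    }

    n = len / sizeof(*info);
    for (i = 0; i < n; i++) {
        if (info[i].Relationship == RelationCache && info[i].Cache.Level == 2) {
            printf("L2 cache shared by logical CPU mask 0x%llx\n",
                   (unsigned long long)info[i].ProcessorMask);
        }
    }
    free(info);
    return 0;
}
```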
 
I guess that answers the question: a 20-30% performance increase in single-threaded (and presumably up-to-4-thread) workloads if module sharing is turned off. I wonder if it would make sense to just disable the 2nd integer unit on all of the modules for everyday use and then re-enable them for massively threaded (>6 thread) workloads? Sounds to me like AMD needs to talk Microsoft into updating Win7 to use only 4 integer units and full FPUs unless more than 6 threads are in flight. Technically, 4 cores at a 35% boost would roughly equal 6 cores with sharing going on, so that would be the smart switchover point (unless it was intelligent enough to share on individual modules, i.e. 5 threads = 3 'fat' cores and 1 shared, 6 threads = 2 'fat' and 2 shared, 7 = 1 'fat' and 3 shared, >7 = all shared). It may also depend on how quickly the modules can enable shared resources; if there is latency involved in turning shared resources on and off, it could push the switchover point higher or make it cycle-based (8 threads for at least 1 second or something).
 
Anand had a short comparison that showed Bulldozer could run up to 10% better on the beta version of Windows 8. Maybe an OS that plays better with this unusual architecture would unlock the chip's full potential, but it still wouldn't be enough to put it on an even keel with Sandy Bridge.
 
How long are we supposed to wait around for theoretical gains that may or may not happen?
 
I guess that answers the question: a 20-30% performance increase in single-threaded (and presumably up-to-4-thread) workloads if module sharing is turned off.

So does that mean the whole module sharing idea is a failure, or is it just poor implementation?
 
Here's a review where setting affinity improved performance considerably in some benchmarks while others remained the same, showing that Windows 7 scheduling isn't optimal for Turdozer. It's an IPC comparison between Turdozer and Deneb at similar clocks.

http://www./forum/hardware-canucks-reviews/47155-amd-bulldozer-fx-8150-processor-review-3.html

Another article that states the same thing. According to AMD, Windows 7 has some scheduling issues; I don't know if a driver can fix it.

"AMD also shared with us that Windows 7 isn't really all that optimized for Bulldozer. Given AMD's unique multi-core module architecture, the OS scheduler needs to know when to place threads on a single module (with shared caches) vs. on separate modules with dedicated caches. Windows 7's scheduler isn't aware of Bulldozer's architecture and as a result sort of places threads wherever it sees fit, regardless of optimal placement. Windows 8 is expected to correct this, however given the short lead time on Bulldozer reviews we weren't able to do much experimenting with Windows 8 performance on the platform. There's also the fact that Windows 8 isn't expected out until the end of next year, at which point we'll likely see an upgraded successor to Bulldozer."

http://www.anandtech.com/show/4955/the-bulldozer-review-amd-fx8150-tested/11
 
Considering the long development time they had with Bulldozer, I find it really curious that they did not work with Microsoft on optimizations.
I think they should have released an improved version of their current architecture instead of BD. No matter how future-proof their new architecture might be, it doesn't matter if it doesn't perform well on today's software. People will not buy the product simply on the hope that it will perform much better years later.
 
Methinks it is convenient that AMD so quickly found someone else to blame for their disappointing performance.
 
So does that mean the whole module sharing idea is a failure, or is it just poor implementation?

It looks to me like the scheduler has to support it for it to work. If properly supported, it looks to provide slightly higher performance in single- or lightly-threaded apps compared to a Phenom II X6, and 25-30% more performance in highly threaded apps. That's not a bad jump. The X6 would perform better only in cases where there were 5 or 6 simultaneous threads (not more, not less), as the BD would have to share 1 or 2 modules to handle those workloads, but the advantage wouldn't be that much.

I think not having support in the Win7 scheduler is a failure - hell, just changing the core fill order from 1,2,3,4,5,6,7,8 to 1,3,5,7,2,4,6,8 should give improvements, as you'd only start module sharing once you go over 4 threads.
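
For an app that spawns its own workers, that fill order amounts to something like this sketch (same unverified assumption about which CPUs pair into modules):

```c
/* Sketch of the "fill one core per module first" order: worker 0 -> CPU 0,
 * worker 1 -> CPU 2, ..., worker 4 -> CPU 1, and so on. Assumes CPUs
 * 0/1, 2/3, 4/5, 6/7 are the module pairs, which I haven't verified. */
#include <windows.h>

static const int fill_order[8] = { 0, 2, 4, 6, 1, 3, 5, 7 };

/* Pin the given thread according to its worker index. */
static void pin_worker(HANDLE thread, int worker_index)
{
    DWORD_PTR mask = (DWORD_PTR)1 << fill_order[worker_index % 8];
    SetThreadAffinityMask(thread, mask);
}

int main(void)
{
    pin_worker(GetCurrentThread(), 0);  /* e.g. the main thread -> CPU 0 */
    /* ...spawn workers 1..N and call pin_worker() on each handle... */
    return 0;
}
```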

Then again, I'm not sure how much it costs to switch a module from shared to unshared, so constant switching of module states could slow things down versus forcing shared or unshared only. The Win8 beta benchmarks indicate it's pretty good when the scheduler supports it, but we need a real, full-on review comparing the various chips under Win8 to know how much it helps BD.

Either way, it's not going to outright beat a hyper-threaded Sandy Bridge, so in that sense it's a failure. As a step forward from the Phenom II, it's not (IMO).
 
You obviously have no idea what you are doing. A 1090T runs WoW just fine. Hell, a Pentium-D would run WoW just fine.

That's not true. A Pentium D at 3.6GHz with a GeForce GTX 285 plays WoW and Rift like shit. However, a 3.6GHz Sempron 130 does quite well in WoW and decent in Rift. The Sempron beat the Pentium D in CPU benchmarks by about 2% even at its stock 2.6GHz, and it was locked single-core. I used both systems for bitcoin mining.
 
So after waiting four years for it to come out, they want you to wait longer for code optimizations = FAIL
 
I am so disappointed. Competition? Guess I'll have to wait on ARM in 2020.

I have never seen so many excuses about performance as with this... Well, if I'm building a VM datacenter judged on $/W, then there's a discussion. But really, why the fuck is anyone building on this platform?
 
Hello, and thanks for the great reading material. Wanted to ask if testing of Eyefinity and other games is in the works, as I enjoy the reviews here. If so, can you give a little teaser (was there some pleasant/unpleasant surprise so far)? :)
 
Well, of course I'm disappointed that AMD didn't kick Intel's ass. However, much like the Phenom I, the B3 stepping will probably fix quite a few issues. No miracles likely, but we may see some improvements, judging by how much there is to fix.

Meanwhile, I'm still fine with my Phenom II quad at 3.6GHz: I've got graphics cards to upgrade (waiting for 7000-series Radeons/600-series nVidia), so I'm sitting this CPU upgrade cycle out. I'll hold out for either the B3 stepping of Bulldozer (if it somehow shows a dramatic, if unlikely, improvement) or Piledriver benchmarks before I upgrade. Judging by the preliminary Battlefield 3 numbers, it looks like pretty well any quad-core CPU will adequately drive that game, so long as you have serious graphics power. That's where I'm focussing my money for the next 3-6 months at this point.
 
Well, of course I'm disappointed that AMD didn't kick Intel's ass. However, much like the Phenom I, the B3 stepping will probably fix quite a few issues. No miracles likely, but we may see some improvements, judging by how much there is to fix.

It won't. There is nothing to fix - this is an issue with the architecture itself. It's like waiting for a fixed Radeon HD 2900XT, which never came. Maybe Piledriver will be to Bulldozer what the HD 3870/4870 was to the HD 2900XT, maybe not. But waiting for some "magic B3 revision" is pointless. Get yourself a Phenom II X6 while you can, or use Intel.
 
It won't. There is nothing to fix - this is an issue with the architecture itself. It's like waiting for a fixed Radeon HD 2900XT, which never came. Maybe Piledriver will be to Bulldozer what the HD 3870/4870 was to the HD 2900XT, maybe not. But waiting for some "magic B3 revision" is pointless. Get yourself a Phenom II X6 while you can, or use Intel.

Actually, there's loads to fix. Even if none of the fixes change the IPC one iota, that CPU seems to be bleeding power left, right, and centre. A new stepping can fix a lot (or even most) of the power-leaking gates and, at a minimum, allow lower power consumption, and possibly even better, higher overclocking. Look at the Phenom I 9850 B3 stepping (a chip in my other system), which is able to overclock easily from 2.5GHz to 3.1GHz with a stock cooler - something the B2s couldn't pull off. The current, power-bleeding Bulldozer can O/C to ~5GHz. Imagine exactly the same CPU, but with most of its bad power gates fixed: we could be looking at 5.5 or 6GHz.

However, despite this, I am almost certainly not going to buy a Bulldozer-based CPU. My current mobo is Socket AM3 (not AM3+), and I'm still running DDR2 RAM. My next upgrade will either be a next-generation Sandy Bridge or a Piledriver (depending on benchmarks). Your suggestion to grab a Phenom II X6 isn't a bad one, but for most games it's not going to boost my performance by much, if at all, and frankly my quad-core Phenom II is plenty fast for everything else I do apart from games, so it'd kind of be $170 that I could have better put towards GPUs. I'm going to get a pair of 6950s, or possibly next-generation Radeons, before I upgrade my CPU, since that's by far the bigger bottleneck for upcoming games.
 
You didn't get it - there is nothing to fix in Bulldozer as it is. The fix is maybe what we call "Piledriver". There won't be a "new revision of Bulldozer". Phenom got a new revision because it had an error which was significant for usage - a blocking issue. High power consumption or low performance isn't a blocking issue.

And what is your super overclocking worth, when your 5GHz 8-core FX-8150 equals a 4GHz 6-core X6 1100T in performance?
 
You didn't get it - there is nothing to fix in Bulldozer as it is. The fix is maybe what we call "Piledriver". There won't be a "new revision of Bulldozer". Phenom got a new revision because it had an error which was significant for usage - a blocking issue. High power consumption or low performance isn't a blocking issue.

And what is your super overclocking worth, when your 5GHz 8-core FX-8150 equals a 4GHz 6-core X6 1100T in performance?

I completely got it. I told you exactly what I was thinking. The current architecture is not optimal for varied workloads, most especially not single-threaded or floating-point calculations. AMD obviously spent most of its time optimizing this chip for server workloads because that's where the money is. Clearly, they didn't have the resources to work on a separate, desktop-optimized version. Furthermore, the emphasis on multi-threaded integer performance while sharing the FPU between two integer units demonstrates that AMD doesn't think the FPU in CPUs is long for this world, and I have to agree. When you see how fast a GPU runs single-precision floating-point calculations, you'd think twice about bothering to continue to waste CPU die space with ever-larger x87 FPU units, too. Basically, even a Radeon 5000-series or nVidia GTX 400-series GPU will beat the crap out of an Intel i7 2600K or even an i7 980 when it comes to floating-point calculations. The writing is on the wall, and AMD is moving in that direction. Any switchover in computing like this is going to be moderately painful (look at moving from 32 to 64 bit), but at least AMD is leading the way, as they often have been doing since the K7, while Intel sits back and watches, waiting to copy AMD once they look like they've gotten the formula right.

It is obvious to me that AMD intends these CPUs for servers, and that they further expect those servers to be loaded with dedicated GPUs to deliver the floating-point side of the equation. The latest Cray supercomputer design win is proof of this (http://www.tgdaily.com/hardware-features/58984-cray-plans-massive-supercomputer-upgrade). It is also obvious to me that the next incarnations of this CPU (Piledriver) will come with built-in GPU circuitry intended primarily to provide the floating-point power formerly supplied by x87 FPU units, not merely to provide built-in graphics capability. For better or worse (and it's almost a no-brainer that it's for the better), AMD is moving forward with CPU/GPU integration at a breakneck pace.

I don't know where you think you're going with this. I've already said I'm disappointed with the first FX incarnation. If you didn't read that, I can't help you out. Frankly, I'd like to see Jen-Hsun Huang swallow at least half of his ego and allow AMD and nVidia to merge, so they can stop wrestling with each other while Intel plots both of their destructions, and combine resources to produce class-leading products all the way from the dual-core/quad-core ARM space, to competitive APUs for the home/desktop/workstation, up to the very highest-end x86 server APUs. This would free up AMD's Radeon people to help out AMD's CPU engineers, and nVidia could continue to focus on ARM and GPU products. However, this is such an obvious, fabulous, and logical thing to do that it is unlikely to happen.
 
I reacted to "However, much like the Phenom I, the B3 stepping will probably fix quite a few issues.", where I simply don't see such a "stepping" or "revision" coming that would fix the subpar single-thread performance. And even if such a thing comes sometime during Q2/2012, it will already be too late for AMD, because their opponents won't be Sandy Bridge, but Ivy Bridge and Sandy Bridge-E.

Even now, the best they can do at a high overclock is to pull level with a non-HT Sandy Bridge CPU at stock in real-world use cases. The single-core design they put in Bulldozer is simply flawed, and can't just be "fixed in a stepping". Unless you call Piledriver a "stepping".
 
You're absolutely right, Rome wasn't built in a day. But AMD took 4 years to build a pile of shit CPU :(

Those benchmarks are absolutely embarrassing. If this is meant for server performance, I'd like to see some SQL numbers and maybe web transaction numbers. But in the end, this is a bomb. Such a shame.

Yes, it's such a hot, steaming pile of shit that the world's foremost supercomputer manufacturer is building the world's fastest computer out of them...

http://www.google.com/hostednews/af...docId=CNG.cdfc5e5c632edb4ea036976d050b02b9.e1

Give me a break. Optimized for older, single-threaded workstation workloads? No. A piece of shit as a result? Hardly. Yes, it falls short in many benchmarks we had hoped it would excel in, but it really is a very radical departure. It's a little like picking up an assault rifle and saying "gee, this club isn't any better than my old club." When AVX optimizations are added to the many programs that can benefit from them, the current FX chips won't look so weak, and that doesn't require Windows 8.

Piledriver and all its fixes and optimizations have already been brought dramatically forward for an early 2012 launch. The new AMD CEO has done what he needed to do to make the best of a bad situation. He hit the ground running, and made a hard call. If he hadn't allowed the current FX chips to come out, AMD would have been looking at a serious revenue shortfall this quarter, and VERY pissed off industry partners (motherboard makers, supercomputer and other manufacturers). That's all this is, a hiccup that will soon be remedied.
 
I reacted to "However, much like the Phenom I, the B3 stepping will probably fix quite a few issues.", where I simply don't see such a "stepping" or "revision" coming that would fix the subpar single-thread performance. And even if such a thing comes sometime during Q2/2012, it will already be too late for AMD, because their opponents won't be Sandy Bridge, but Ivy Bridge and Sandy Bridge-E.

Even now, the best they can do at a high overclock is to pull level with a non-HT Sandy Bridge CPU at stock in real-world use cases. The single-core design they put in Bulldozer is simply flawed, and can't just be "fixed in a stepping". Unless you call Piledriver a "stepping".

I stand by my original logic that a new stepping would allow many of the power-leakage issues to be resolved. Sure, this wouldn't change the IPC of the design, but it would allow higher clock rates in the same power envelope. However, your allusion to Piledriver is probably more likely. AMD has almost certainly already begun wide sampling of the greatly improved Piledriver CPU/APU and is likely to bring its release forward rather than focusing on a stepping update for the existing Bulldozer. That's what I would do if I were AMD's CEO, in any case.
 
So do any online retailers actually have this chip in stock? The 8120 seems to be somewhat available. I've seen some people suggest the quality and OC potential on the 8150 is greater. Any truth to this or is it simply a factory OC?
 