Poor scaling in Windows V7 client

fastgeek

Yes, yes, I know! Use Linux! With that out of the way. :)

At the moment I'm compelled to have Windows on these servers, so thought while they aren't being used over the weekend I'd have them fold. Ah, now I can feel something else coming on... "Use vbox!" Well, I tried that a while back and didn't have much luck with it. Seeing how it never came close to fully utilizing the processors I simply decided to try sticking with the new V7 client and see how it worked out. Not all that well really. The systems in question are...

Dell R720 with 2x E5-2680 (2.7GHz)

Dell R820 with 4x E5-4650 (2.7GHz)

So this should give us a nice apples-to-apples view of how this client handles a 2P and 4P setup. Now with the older client I could get the "baby" 6900 bigadv work units; but these guys only seem to be interested in grabbing 780x's. No matter, we're really after how well they do. So...

R720, Project 7808 (7, 58, 28) = approx 2m40s TPF

R820, Project 7808 (4, 199, 75) = approx 2m20s TPF

Well shit! That sucks, doesn't it? Especially when you consider that, under Ubuntu with TheKraken running, the R820 could do a 6900 @ 4m30s TPF and the R720 was at 8m20s. THAT is the kind of scaling I was hoping to see; roughly 85% faster going by those times (500s vs 270s per frame).
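To put numbers on the scaling, here is a quick sketch that turns the TPF readings quoted above into speedup ratios. The helper names are made up for illustration, and the only inputs are the four times from this post.

```python
def tpf_seconds(minutes, seconds=0):
    """Convert a time-per-frame reading (minutes, seconds) to seconds."""
    return minutes * 60 + seconds


def speedup(slow_tpf, fast_tpf):
    """How many times faster the box with the lower TPF is."""
    return slow_tpf / fast_tpf


# R720 (2P) vs R820 (4P), Windows V7 client, Project 7808
win_7808 = speedup(tpf_seconds(2, 40), tpf_seconds(2, 20))

# R720 (2P) vs R820 (4P), Ubuntu with TheKraken, 6900
linux_6900 = speedup(tpf_seconds(8, 20), tpf_seconds(4, 30))

print(f"Windows 7808: {win_7808:.2f}x")
print(f"Linux 6900:   {linux_6900:.2f}x")
```

That works out to roughly 1.14x under Windows against roughly 1.85x under Linux for doubling the sockets. Keep in mind the two rows are different projects, so this is a rough comparison rather than a strict benchmark.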

Any thoughts as to why this is going so poorly? The processors are all pegged @ 100%. As an aside, I set the 820 to smp64 and it did much worse than leaving it at -1 (which assigned it as smp32)... that was kind of odd.

Anyway... sorry so long. Hope someone finds it of interest.

Regards,
fastgeek
 
Try running 2x clients on the 4P

See what that does.
 
Hey fastgeek, there have not been many bigadv WUs for Windows for a while. If you have v7 set up correctly it should get one every once in a while, though. The 780x WUs are among the largest of the A4 SMP WUs (I believe the completed WU file is something like 25MB) and, in my opinion, were never benchmarked properly to begin with; they are among the lowest producers.

As far as scaling goes, Windows SMP does not do as well as Linux but is fairly close; it's just that your luck of the draw on the 780x WUs sucks. Hopefully you will get some better ones. To get any available 6900 WUs you will need the following flags, and I only get one about every 4 or 5 days on my Gulftowns. Anyway, hope it helps, and good luck with the WU draw.

PS Kendrak, I doubt that will do any good; at least it did not for me.

client_type
bigadv

max_packet_size
big
 
Kendrak - an absolutely great idea! :D Seems to be working just fine too. Thankfully I was able to muddle my way through adding another client and they're both running at ~2m20s TPF. Now any idea why the system isn't getting those "Windows BA" units? Or did that go the way of the Dodo bird with v7?

*edit*

OK, Grandpa, I'll take a look at all that. Did add bigadv to the core settings; but since they're A4 cores I can now see why that didn't work. Thanks! Oh, is it supposed to be hyphens or underscores separating the words? Have seen them both ways and would like to get it right before I leave. Guess I should take a look at the official v7 page too. :p

*edit2*

Seems you can input it either way; but the program changes it to hyphens... and that's what's displayed on this page too.
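For anyone scripting their settings, the underscore/hyphen behaviour can be mimicked in a couple of lines. This is only a sketch of what the client appears to do from the outside; `normalize_option` is a made-up helper, not part of FAH, and the option names are the two quoted earlier in the thread.

```python
def normalize_option(name: str) -> str:
    """Return the hyphenated spelling of an option name (hypothetical helper)."""
    # Assumption: the only change the client makes is underscore -> hyphen.
    return name.strip().replace("_", "-")


# Options from earlier in the thread, typed with underscores.
options = {"client_type": "bigadv", "max_packet_size": "big"}
normalized = {normalize_option(k): v for k, v in options.items()}

for name, value in normalized.items():
    print(f"{name} = {value}")
```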
 
Well I have to take that back. Maybe I did something wrong, but after running for a while the TPF jumped up to 7-10m, so dumped that whole thing. Thanks to Grandpa01 I made the correct changes to the clients and now they both have 6900's. No idea how the 720/820 times will work out; but my thanks to both of you for helping! :)

This really does remind me that I need to make a folding LiveCD or something. The systems I use all have more than enough memory for a RAMDisk. Hell, the R820 has 128GB for god's sake. Now maybe that's not very much to some of you; but it's the most I've ever used personally. :p
 
No, that's a bunch of RAM.

It is as much memory as the SSD on my main rig :p
 
Much as this pains me to admit... I missed something rather important. Forgot that hyper-threading / virtual cores were disabled for another test on the 4P... which of course meant that Windows was only seeing 32 of the 64 logical cores. D'Oh! Moron! However, with all 64t running the times still weren't all that great. So on that system I have two slots running, with each slot assigned to its own pair of CPUs, and they both seem to be around the same times per update. I've been here WAY too late messing around with this, so just going to hope all stays well.

That being said, doing 6900's under Windows, even with the V7 client, really does suck. The 2P box is clocking in at 10m30s TPF, whereas under Linux/TK it was at 8m20s as previously mentioned. I think the 4P is going to be even worse at ~13m TPF for each slot. :(

So, if anyone can point me to a nice guide that will allow one to get the absolute most out of a V-Box vm with Ubuntu, please let me know and I'll try to get some results early next week... assuming these systems don't need to go back to other testing duties.

Thanks! :)
 
Your PPD will be less running 2 WUs on 1 rig. That is what the QRB is all about: the points curve ensures you will make more PPD running a single client and having a quicker return time. ;)
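A rough sketch of why the curve favours one fast WU over two slow ones. It assumes the bonus formula as I understand it from the public FAH documentation, final points = base points * max(1, sqrt(k * deadline / elapsed)). The base points, k and deadline below are invented for illustration, and it also assumes splitting the box in half exactly doubles the time per WU.

```python
import math


def qrb_points(base_points, k, deadline_days, elapsed_days):
    """Points for one WU under the quick return bonus (formula assumed, see above)."""
    bonus = math.sqrt(k * deadline_days / elapsed_days)
    return base_points * max(1.0, bonus)


def ppd(base_points, k, deadline_days, elapsed_days, slots=1):
    """Points per day with `slots` WUs running side by side, each taking elapsed_days."""
    return slots * qrb_points(base_points, k, deadline_days, elapsed_days) / elapsed_days


# Hypothetical WU: these values are made up, not taken from any real project.
BASE, K, DEADLINE = 1000.0, 2.0, 6.0

one_slot = ppd(BASE, K, DEADLINE, elapsed_days=1.0)            # whole box on one WU
two_slots = ppd(BASE, K, DEADLINE, elapsed_days=2.0, slots=2)  # box split in two

print(f"one slot is {one_slot / two_slots:.2f}x the PPD of two slots")
```

Under those assumptions the single slot comes out ahead by a factor of sqrt(2), whatever base, k and deadline you plug in, as long as the bonus is above 1. If the client does not actually scale across all the cores, as seen with the 780x units earlier in the thread, that advantage shrinks or disappears.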

Edit to add vbox:
As far as vbox goes, just install it and configure it to use 64 cores (not sure how many cores it supports) and plenty of memory: https://www.virtualbox.org/wiki/Downloads. Then use musky's guide for installing FAH and TheKraken: http://hardforum.com/showthread.php?t=1601608
 
Yes, but even after setting things up, the time per update in the client still wasn't much better than the 2P system was doing. I will play with it more on Monday, but that's what I seem to recall. :) Maybe Windows, or the FAH Windows client, simply can't deal with this one process using all 64 threads? From a pure PPD point of view I wish Ubuntu was back in these boxes; but from a messing around POV it's slightly interesting. :-p Heck, now that I've corrected my earlier boneheaded mistake I'll let the systems grab a 780x and see how they compare. (With just one slot running all cores)

See, in a way this is also general testing for my work, to see how these systems scale in different scenarios; hence having the same speed of processors. Under Ubuntu the 4P really shines when it comes to FAH; but obviously not under Windows. Will be interesting to see if other tests return similar results.
 
Whilst the scaling does suck, those TPFs on 7808/09 aren't that bad. Just for reference, my L5640 box under Windows has a TPF of 5m28s.
 