I have some problems with my new Xeon 4P rig and need some help.

It is very annoying that I must reset the rig between WUs to achieve the best possible PPD. If I do not, I frequently end up with a TPF of nearly 10 minutes.

I also discovered something strange for project 8101: the download file size varies by over 6 million bytes, see here:

 P8101 06.09.2012  DL 11:53:11
 [11:53:11] Connecting to http://128.143.231.201:8080/
 [11:53:20] Posted data.
 [11:53:20] Initial: 0000; - Receiving payload (expected size: 30306354)
 [11:54:13] - Downloaded at ~558 kB/s
 [11:54:13] - Averaged speed for that direction ~679 kB/s
 [11:54:13] + Received work.
 [11:54:13] Trying to send all finished work units
 [11:54:13] Project: 8101 (Run 4, Clone 1, Gen 87)
 

 

 P8101 07.09.2012  DL 13:23:21
 [13:23:21] Connecting to http://128.143.231.201:8080/
 [13:23:27] Posted data.
 [13:23:27] Initial: 0000; - Receiving payload (expected size: 24114924)
 [13:24:04] - Downloaded at ~636 kB/s
 [13:24:04] - Averaged speed for that direction ~670 kB/s
 [13:24:04] + Received work.
 [13:24:04] Trying to send all finished work units
 [13:24:04] Project: 8101 (Run 24, Clone 5, Gen 22)
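For anyone who wants to compare payload sizes across many WUs without reading the logs by eye, here is a minimal Python sketch. It assumes the client log always prints the payload line in the same form as the two excerpts above; the function name `payload_sizes` is made up for illustration and is not part of any FAH tool.

```python
import re

# Pattern taken from the log excerpts above; assumed to be stable across WUs.
PAYLOAD_RE = re.compile(r"Receiving payload \(expected size: (\d+)\)")


def payload_sizes(log_text):
    """Return every 'expected size' value (in bytes) found in a client log."""
    return [int(m.group(1)) for m in PAYLOAD_RE.finditer(log_text)]


# The two payload lines quoted in the post above.
log = """
[11:53:20] Initial: 0000; - Receiving payload (expected size: 30306354)
[13:23:27] Initial: 0000; - Receiving payload (expected size: 24114924)
"""

sizes = payload_sizes(log)
print("sizes:", sizes)
print("spread in bytes:", max(sizes) - min(sizes))
```

For the two downloads above the spread works out to 6,191,430 bytes, which matches the "over 6 million bytes" observation.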
 
On P8101 the TPF has suddenly risen to 10:31 after every new WU. When I reboot it stabilizes at 9:33, which is also very poor. Is anyone else with a 4P E5-4650 rig experiencing this? It has been many days since the TPF was under 9:10. :D

Update: After 10% TPF is now 9:38 and rising!
 
Leave it alone for a week and look at the tpf averages then.

You are poking and prodding it far too much to draw valid conclusions...

WUs can have variations in speed... make sure you are not just freaking out over nothing.

What distro flavor and version are you running?
Any updates, tweaks?
wrappers...

Something does look wrong... but I think it's time to step back, take a breather, and come back at it fresh.
 
I agree with Patriot here. You also have to remember that P8101 has a pretty big variance between WUs compared to most other projects.
 
Thanks Patriot!
The question I was asking was:
Is anyone else with a 4P E5-4650 rig experiencing this? If anyone else is experiencing the same, then I am not worried. Reading between the lines, this is normal, so again, I am not worried.

The only tweak I use is thekraken; I am not poking at anything. The Ubuntu version is 11.10 with kernel Linux 3.0.0-24 generic.

Some fluctuation is normal... 30 seconds to 1 minute is not... But take a break from fidgeting with it... Rebooting in the middle of a WU is considered dicking around...

Your TPF does seem high... but you can't count the "lowered" initial TPF following a reboot as valid... that's normal WU stop/start behavior and cannot be trusted.
As it is, your low numbers only touch my non-turbo TPF...
 
A variance from 9:10 to 9:30 is normal for 8101 WUs. Some people have found that this may be related to the "Run number" of the WU: 8101 WUs with a very small "Run number" are usually faster. I also noticed it in my 4P 4650 tests. I achieved an average TPF of 8:47 (min TPF = 8:44) with a fast 8101 WU of "Run 0", and an average TPF of 9:04 with a slow 8101 WU of "Run 17". These are the fastest and slowest 8101 WUs I have met.

However, it is obviously abnormal for the TPF to be as slow as 10:31 for every new WU, with only a reset bringing it back. I do know that some 2P E5 users have also experienced this problem. I think it is an interesting problem and would look into it further if I met it on my own rig.
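To put the "normal" and "abnormal" numbers side by side, here is a small Python sketch that converts TPF strings to seconds and expresses a slow TPF as a percentage over a fast one. The helper names are invented for illustration; the TPF values are the ones quoted in this thread.

```python
def tpf_seconds(tpf):
    """Convert a TPF string such as '8:47' (minutes:seconds) to seconds."""
    minutes, seconds = tpf.split(":")
    return int(minutes) * 60 + int(seconds)


def slowdown_percent(fast, slow):
    """How much longer a frame takes at `slow` than at `fast`, in percent."""
    f, s = tpf_seconds(fast), tpf_seconds(slow)
    return 100.0 * (s - f) / f


# Fast "Run 0" WU versus slow "Run 17" WU (ordinary WU-to-WU variance).
print(round(slowdown_percent("8:47", "9:04"), 1))
# Fast WU versus the 10:31 seen after a new WU without a reset.
print(round(slowdown_percent("8:47", "10:31"), 1))
```

The first case is a difference of a few percent, while the second is closer to a fifth, which is why the 10:31 figure looks like a real problem rather than ordinary variance.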


Nice TPF... I never recorded what I was running after I got turbo working... but I was doing 9:07 avg beforehand.
 
In recent days only 8102 has been coming down, and this WU is much more stable than the 8101. TPF varies between 7:10 and 7:20, but is more often closer to 7:10. I no longer have to reset the computer as I had to with the 8101 WU.
 
I have some problems with my new Xeon 4P rig. Mobo is SMX9QRi-F+ with 4 x E5 4650 ES C1 stepping QBEC, and they are all recognized and working perfectly in socket1 and socket2. I have tried with 16 x 4G of this memory http://www.corsair.com/en/memory-by...annel-ddr3-memory-kit-cml16gx3m4a1600c9b.html and 16 x 1GB of this http://www.kingston.com/datasheets/KVR1333D3N9_1G.pdf but there is no change...
Thanks alias for coming to my aid elsewhere. Has that problem that you described above been completely resolved?
 
Yes, it has. When I changed to SAMSUNG DDR3 4GB 1333 MHz ECC Reg., Dual Rank (M393B5270CH0) memory the problem was solved, and the rig is now working 100%.

 
I am posting here as I hope this may help someone else make a decision about a 4P E5 box. Thanks, Alias.
Reading this thread got me thinking that a 4P E5 setup might suit me better than another 4P G34.
So, buttocks clenched, I ordered an X9QRi-F+, 4x ES 4650 CPUs (C0 stepping),
8 sticks of Kingston KVR16R11S4/4HC Reg RAM, an AX1200i PSU, two 8-pin to 2x 8-pin adapters and 4x Evo 212 coolers. Total: less than $4k. ;)

I received all the parts, assembled them and was able to boot to the SM screen but not into the BIOS.
As it turns out, my guess that 2 sticks of RAM per CPU in the (1) slots would boot proved wrong.
After an hour of trying various combinations I got the machine to boot with all eight sticks on CPU 1.
The BIOS recognises all 4 CPUs and the 32GB of RAM. :p

I am now in unknown territory, but I was able to install Windows Server 2012 and run FAH v7, which is folding on 32 cores
with the CPU usage in Task Manager bouncing around various cores.
In 3ds Max I can render on 64 cores with all at 100% use, so the CPUs are working. YES!!
I can’t install Ubuntu to fold as it runs into many errors and crashes. I am assuming it doesn’t like all the RAM on one CPU,
whereas Windows can handle it, maybe in a performance-restricted way.
Knowing the RAM works in this board, I have ordered another 8 sticks so I can install them as SM recommends in the manual, with 4 sticks per CPU.
Hopefully then everything will be running fine and I can post [big] PPD soon.
 
The KVR16R11S4/4HC memory should work fine, as far as I can see from the Kingston compatibility list: http://www.ec.kingston.com/ecom/configurator_new/models.asp?System=&Distributor=0&OriginalSysPN=&Manufacturer=&KTCPartNo=KVR16R11S4%2F4HC&root=&LinkBack=&submit1=Click+Here

Running with 2 sticks of memory per CPU should be fine too. Are all the CPUs exactly the same stepping, or are there some differences?

Also, if you have not done so, I would recommend resetting the CMOS for at least 30 seconds while the mobo is disconnected from the power cables.
 
You will need to specify -smp 64 in the client; it defaults to a max of -smp 32, which is what you are seeing. As far as how to do it in the v7 client, I have no idea, but I am sure someone else can chime in.

I haven't heard of an issue with Ubuntu and these types of setups - several folks run them now. You are going to have to be more specific than "many errors and crashes" for assistance in getting it to work. It is very much worth your time getting Linux of some sort running on such a machine, though.
 
So here's my setup:

X9QRi-F+
Xigmatek Elysium
16x4GB Samsung M393B5170FH0-CH9
Corsair AX1200i
EVGA G210
4xE5-4650 C0 stepping
160GB Seagate HDD

I just finished assembling it, but I'm not getting any video. I'm currently trying to use the HDMI output of the G210.

Does the mobo not recognize PCIe devices the first time it's booted, meaning I'd have to plug into the onboard VGA connector for setup? That would be a bit of an issue since I don't have the backplate connectors actually lined up with the backplate area of the case due to where I drilled the new standoff holes (lining that up would have resulted in the top 3 standoff locations being over empty air, which didn't seem like a good idea). The VGA connection is somewhere around the topmost PCIe hole in the case, and I cannot physically get a connector to it. I suppose I could go buy a handheld power saw or something and try to cut out some parts of the back, but I'd prefer not to do that.

I haven't actually done any of the standard troubleshooting steps yet (remove all memory, remove all but CPU1, etc.) since it's relatively late and I just got it assembled. I have the 210 in PCIe slot 4, the topmost one, in case that makes any difference. When I power on I get a quick lone beep, which I guess is the "ready to boot" message. After several seconds, I get a series of beeps that doesn't seem to correspond to their limited list of error codes. It sounds sort of like

beep........beepbeepbeepBEEP

where that last one is higher-pitched than the others. The first beep is rather short, there's just a pause between it and the rest. There may be an additional beep in there; the set after the pause is very rapid.

I noticed that before the beep code a number of the LEDs between PCIe slots 2 and 3 are amber. Don't know what that means, it may be normal. Also, when the computer is off but is still getting power, there's a rather bright orange LED on behind PCIe slot 2. I really wish SM included any sort of LED guide in the manual. It's already 109 pages long, was it really going to be that much worse to include more troubleshooting material?

I have all of the memory in the slots between the CPUs, as that's what the manual indicated to do. I was a bit disconcerted over the fact that the manual didn't have a listing of 4 CPUs and just 16 sticks; it only lists 4 CPUs and 18-32 sticks. This will run with just 16, yes?
 
The beeps look (read: sound) just like my rig's, so I think it is just that the video signal is not going to the G210. The backplate does not fit whatever you do with it, so do not use it. I use this box for all my G34 motherboards, and I have to make access to the internal VGA by removing whatever is in the way. Also, it is always a good idea to test the rig on a wooden board before you assemble it in the box. Anyway, try to get access to the onboard VGA and I am sure the signal will come up on the screen.

As for the memory, the easiest way to explain it is that all the blue slots should be populated to use 4-channel memory.

Update: It is also possible that your EVGA G210 will work if you disable the internal VGA on JPG1, see page 1-5 in the manual. The manual can be found at http://www.supermicro.com/products/motherboard/Xeon/C600/X9QRi-F_.cfm

BTW: I would like to see a picture of how the motherboard fits in the Xigmatek Elysium.
 
I'll try those suggestions after work. I just realized that I put half the memory in the wrong slots, at least going by what the manual says, but I trust your experience more, in which case I still have half the sticks in the wrong slots, just a different half.

I'll get a picture up once it's running. There's not a whole lot of extra room, that's for sure. I have the mobo positioned about as close to the front of the case as possible without actually touching the HDD cages and such; there's about a quarter inch gap between the rear edge of the board and the back of the case. I didn't think I'd be able to line the PCIe slots up with the openings in the case, so I figured having some space there would be useful somehow.
 
Major success! :eek::D

Turns out the video was the only problem. I disabled the onboard video via the jumper and everything worked from there on out, no need to cut out pieces of the case. It apparently doesn't even care that the memory is not in the ideal locations. I still have all of the slots located between the sockets populated while none of the slots outside have any ram. Would there be any speedup if I changed the arrangement?

I think there's performance improvements to be had, because looking at other people's numbers I'm definitely behind. I'm getting around 9:20 TPF on an 8101 (shakes fist at them for the very first time:mad:) at what I believe is 3.1 GHz. I did install thekraken, and it appears to be working seeing as how running top shows thekraken-FAHCore (or however it's printed) taking up ~6400% CPU time. NUMA is enabled. I have EIST enabled and the Turbo option in the BIOS; is that all there is to getting it to Boost all the time? I can't run the ocng clockspeed command for some reason, and /proc/cpuinfo is telling me 2700 MHz, but I'm thinking it's just reading the expected stock speed. I had previously disabled EIST and was seeing 10:30-10:45 frame times and lower temperatures (before: ~42-48 C depending on processor, now all are in the mid to high 50s), so I feel like I did get that working. What did I miss here? I am running 1333 MHz ram, so if any of those faster figures were with 1600, that could explain some of the difference.

Oh yeah, here's a picture:
[Image: img20130409211513583.jpg]

It wasn't running because I don't like putting on or removing the side panel when it's powered up and I didn't feel like turning it on and off again just for a picture. I had to bend the tab at the bottom of the GPU's bracket since it's supposed to go in that little groove at the back of the case, but needless to say it was nowhere near it and was instead hitting the mobo tray. It's not really secured by anything other than the PCIe slot itself, but it's a pretty light card and I'm not expecting it to pull itself out. I also know my cabling leaves a lot to be desired; this case has a ludicrous amount of cabling/WC holes in the tray, but almost all of them are hidden by the monstrosity that is this motherboard. The two fans on top and one on the bottom are Cooler Master Megaflow 200's, with the bottom one pushing and the top two pulling.

I was actually able to do a semi-credible job of drilling and tapping the standoff holes. I think I have 10 of the 13 mounts secured, with an 11th just a bit too far off-center for the screw to get to it; the other two are over empty air. Of course, half of the standoffs that are secured aren't well-centered as it is, but they're close enough that the screws could catch them at a bit of an angle. I'm a pretty terrible driller, so the fact that I only failed horribly at one of the holes is a minor achievement, and I don't feel like it's in danger of falling out.

If anyone else ends up using this case, make sure to mount the hard drive before putting the mobo in. There's a couple of cages that hold the drives, but they only come out by pulling them towards the back of the case. With this mobo in there, the bottom cage just barely can't make it out (ended up using some manual dexterity to get the drive mounted in it) and the top one has absolutely no chance of going anywhere. It also meant I couldn't snip the LED wires on the front-mounted fans; I really don't like LEDs and cut them where I can, but these were impossible to get to without removing the entire mobo.
 
Congratulations on your new rig, and well done putting the hardware together. I think it is probably the memory population that keeps the TPF from getting better than 9:20 for 8101. My guess is that the memory is only running as 2-channel instead of 4-channel. In my experience the difference in TPF between 2-channel and 4-channel memory is approx. 30 seconds.

You can check the frequency with a script made by tear; see below for how to use it.

The right value should be 3.1 GHz for your rig.
You can check it with a frequency-check tool like turbostat or i7z, or with a script provided by tear:
http://darkswarm.org/freqcheck.sh

Install the script, and run it by:

sudo modprobe msr
sudo modprobe cpuid
sudo ./freqcheck.sh 2700

The output should look something like this:
CPU0: 3099 MHz
CPU1: 3099 MHz
CPU2: 3099 MHz
CPU3: 3099 MHz
CPU4: 3099 MHz
CPU5: 3099 MHz
CPU6: 3099 MHz
CPU7: 3099 MHz
CPU8: 3099 MHz
CPU9: 3099 MHz
CPU10: 3098 MHz
CPU11: 3099 MHz
CPU12: 3099 MHz
CPU13: 3097 MHz
CPU14: 3099 MHz
CPU15: 3099 MHz
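With 64 logical CPUs it is easy to miss one core that is not boosting. Here is a minimal Python sketch that scans output in the `CPUn: NNNN MHz` form shown above and lists cores running below the expected turbo speed. The function name `slow_cores`, the 10 MHz tolerance and the 2700 MHz reading in the sample are assumptions made up for this example; the sketch does not run freqcheck itself.

```python
import re

# Line format copied from the sample output above.
CORE_RE = re.compile(r"CPU(\d+):\s+(\d+)\s+MHz")


def slow_cores(output, expected_mhz, tolerance_mhz=10):
    """Return {core: MHz} for cores more than tolerance_mhz below expected_mhz."""
    readings = {int(core): int(mhz) for core, mhz in CORE_RE.findall(output)}
    return {core: mhz for core, mhz in readings.items()
            if mhz < expected_mhz - tolerance_mhz}


# Two readings from the output above plus one hypothetical non-boosting core.
sample = """CPU0: 3099 MHz
CPU13: 3097 MHz
CPU2: 2700 MHz
"""

print(slow_cores(sample, expected_mhz=3100))
```

In this sample only the made-up CPU2 reading is reported, since 3097 and 3099 MHz are within the tolerance of 3100.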

I can see that it is a bit cramped in the box, but it is the same in my $1200 Supermicro box, and if I were building one like this again today I would choose the Xigmatek or build it on a plate of wood.

At the moment my Xeon E5-4650 rig is folding P8101 with a TPF of 8:54. The best I have seen for 8101 on my rig is 8:46. For reasons I do not know, the TPF for P8101 always varies by 10 - 12 seconds from WU to WU.


Just out of curiosity, how much power does it pull from the wall?
 
It's bouncing between 730 and 750W at 3.1 GHz; I removed the Kill-A-Watt before running it at 2.7 (which was unintentional anyway). I'll repopulate the memory tonight.

I tried running freqcheck, but it told me it was obsoleted and that I should use ocng, which in itself didn't work (I don't remember the failure message of either program off the top of my head). Is there a way to force freqcheck to run anyway? I'll run turbostat as well to see if that gives me an accurate reading.
 
I am no expert and can only pass on what I learn from other members on this forum. I guess tear or one of the other guys will read this and help you with this freq problem. Your power draw is a little high, I think. After I rebuilt my rig from passive cooling with a wind tunnel to active CPU coolers, my power draw went from 725 W down to 685 W. My problem was the noise from the wind tunnel fans, which was 70 - 80 dBA. After the rebuild the noise fell to 40 dBA, and I can now have the rig indoors.

It was those 6 wind tunnel fans that accounted for the difference in power between 685 and 725 W. I am very pleased with the rig as it is working now.

Finally P8104 came down, and the rig runs it with a TPF of 5:06, which gives almost a million PPD (996,722). A TPF of 5:05 would give a PPD of 1,001,628.3.
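The two PPD figures above are consistent with PPD scaling as TPF to the power of -1.5. That is what you would expect when the quick-return bonus applies: the bonus factor grows with the square root of 1/time, and you also finish more WUs per day. Here is a minimal Python sketch of that scaling. It assumes the WU stays in the bonus regime, and it only rescales a known reference PPD rather than computing points from the project's base points and deadlines; the function names are invented for illustration.

```python
def tpf_seconds(tpf):
    """Convert a TPF string such as '5:06' (minutes:seconds) to seconds."""
    minutes, seconds = tpf.split(":")
    return int(minutes) * 60 + int(seconds)


def scaled_ppd(ref_ppd, ref_tpf, new_tpf):
    """Estimate PPD at new_tpf from a known (ref_ppd, ref_tpf) pair.

    Assumes the bonus regime, where PPD is proportional to TPF ** -1.5.
    """
    ratio = tpf_seconds(ref_tpf) / tpf_seconds(new_tpf)
    return ref_ppd * ratio ** 1.5


# Reference point from the post above: 996,722 PPD at a TPF of 5:06.
print(round(scaled_ppd(996722, "5:06", "5:05")))
```

Rescaling the 5:06 figure to 5:05 this way lands within about a point of the 1,001,628.3 PPD quoted above, so one second of TPF is worth roughly 4,900 PPD at this speed.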

I love this toy.:D
 
I have to say the noise level is very good. I haven't measured it with anything, but it's roughly on par with the dedicated Linux box that it replaced, and that was just a standard mid-tower case with an OC'ed i7-930.

I'm not really sure what I could do about the power draw. I assume you're running off the onboard graphics? The G210 has a max TDP of 30W, which might explain most of my increase. The abundance of fans probably accounts for the rest.
 
Your EVGA G210 card will draw a few extra watts, I think. I am on a 240V power network, is that what you are on also?

I finally reached over a million PPD.


Yes, that explains most of it!

Well, I believe you are finished with your new build now, and I hope you will get much joy from it. See you later around the forum.

BTW: I live in the north of Norway.
 
I use i7z http://code.google.com/p/i7z/ to check frequency. It will only show 2 nodes at a time, but you can configure which nodes (I use sensors from lm-sensors package to find hottest CPUs, then configure i7z to show those, one stop shopping, frequency and temps).
 
i7z confirmed that it's at 3.1 GHz. I've repopulated the ram and TPF is now down to around 8:55.
 
That is great, and I am sure you will see a TPF as low as 8:46 after a while.
 
Did you ever find a way to prevent the TPF from rising on successive WUs other than rebooting? I know there's variation from WU to WU, but on a particular one I was getting 9:40 before reboot and back down to 8:55 after.
 
My rig does exactly the same, not every time but now and then, and it happens only with P8101. I asked in this forum for a possible solution, but there has not been any answer so far. Sometimes the TPF can be as high as 10:35, but after a reset it comes down to 9:0x and lower, and then after a few % it is usually down to around 8:50 +/- a few seconds.

The same can occur on my SR-2 Xeon 2P E5645 rig, so I think this may be an Intel CPU problem.
 
I noticed that the 8101 TPF increase only seems to occur when going from a non-8101 to an 8101. At one point I had 3 consecutive 8101's and they all ran at about 8:55 without me rebooting.

I removed the 210 after installing xrdp, so I can now control it remotely from my desktop. The Kill-A-Watt is now fluctuating between 710 and 720 W. Considering the crazy top-tier energy prices in SoCal, it might actually result in semi-noticeable savings.
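For a rough idea of what a drop like this means over a year of 24/7 folding, here is a small Python sketch. The wattages are midpoints of the Kill-A-Watt ranges quoted in this thread (about 740 W before, about 715 W after). The price per kWh is a placeholder invented for the example, not an actual SoCal tariff.

```python
HOURS_PER_YEAR = 24 * 365


def annual_kwh_saved(before_watts, after_watts):
    """Energy saved per year of continuous running, in kWh."""
    return (before_watts - after_watts) * HOURS_PER_YEAR / 1000


# Midpoints of the 730-750 W and 710-720 W readings quoted in this thread.
saved = annual_kwh_saved(740, 715)

# Hypothetical price per kWh, purely for illustration; substitute your own rate.
ASSUMED_PRICE_PER_KWH = 0.30

print(saved, "kWh/year")
print(round(saved * ASSUMED_PRICE_PER_KWH, 2), "per year at the assumed price")
```

A 25 W reduction comes to 219 kWh per year when the rig runs around the clock, so the saving scales directly with whatever rate you actually pay.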
 
On my rig there is no such systematic behavior. I remember that when there was 8101 all week, it happened randomly. And for some reason, the power consumption has now fallen to only 665 W. I suppose it could be due to the new 8105, which came down last week.
 
I couldn't get the script to work under Ubuntu 12.04; could someone please advise? I am currently running my X9QRi-F+ in 2P mode, mounted in an SC748TQ chassis, and am waiting on the other processor to arrive next week. Basically, I have the recommended specs as per Supermicro. Thanks
 
Yeah, I've obsoleted it in favor of the clockspeed utility... thing is, clockspeed doesn't yet work for Intel CPUs... I'll try to spend some time on it today :)

In the meantime, feel free to use i7z. It works well with 4P SB-E.
 