OS lockup/freeze vs. floating-point exception: What's the difference?

Joined
Apr 3, 2008
Messages
43
When I was overclocking on air around 3.53GHz (443x8) and I tried to push the system to be stable at 445x8, I would get floating-point exception errors in one of my clients.

Now on water, I'm stable at around 433x9. When I push the clocks too high I get an OS lockup (Xubuntu 7.10), but my FAH log shows no floating-point exceptions or any other folding-related errors.

Both errors disrupt folding, and require a restart and some clock adjustment in the BIOS. My question is, why do you think I'm getting these OS lockups at higher frequencies, and what, if anything, can be done to counter them? (I seem to be capable of folding at these higher frequencies without errors--aside from the system lockups of course). Do you think the memory bus is over-saturated? Would switching to a different Linux distro do the trick?
 

Axdrenalin

[H]ard|DCer of the Month - Nov. 2009
Joined
Jan 28, 2004
Messages
6,226
What's the chipset you're running on? Is it a new P35 or an older 965?

 
Joined
Apr 3, 2008
Messages
43
It's a P35 chipset (MSI neo2-fr P35 mobo). What do you think it could be? The same thing happens in XP and Xubuntu, but not with Notfred. I'd use Notfred if only it gave me 2605s like I'm able to get in Xubuntu and XP via VMs.

Thanks for your reply, this is driving me crazy. No folding errors are showing up in the logs, just a total freeze in Xubuntu or a restart in XP, and it folds at 440x9 (3.96GHz) in Notfred.

Basically I think the CPU itself is stable beyond 433x9, and it seems to be some memory or chipset problem...
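
For reference, the clock figures quoted in this thread are just FSB times multiplier. Here is a minimal Python sketch of that arithmetic; the function name is made up for illustration, and the results are nominal values, so clocks actually measured on the machine can differ slightly.

```python
def core_clock_mhz(fsb_mhz, multiplier):
    """Nominal core clock in MHz: FSB (MHz) times the CPU multiplier."""
    return fsb_mhz * multiplier


# Settings mentioned in the thread, as (FSB in MHz, multiplier).
settings = [(443, 8), (445, 8), (433, 9), (440, 9)]

for fsb, multi in settings:
    mhz = core_clock_mhz(fsb, multi)
    print(f"{fsb}x{multi} = {mhz} MHz ({mhz / 1000:.2f} GHz)")
```

The point of laying it out this way is that 433x9 and 440x9 differ by only 63 MHz of core clock, while the FSB (and everything hanging off it) moves with every step.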
 

Axdrenalin

[H]ard|DCer of the Month - Nov. 2009
Joined
Jan 28, 2004
Messages
6,226
I had a similar issue a couple months ago, and it turned out to be the memory crapping out in one box. Another box I had (965 chipset) was doing something similar and I ended up bringing the FSB down just a bit more, and it's been stable ever since. I'm not sure why you can fold at that speed with notfreds and not be able to do it using a standard OS, though. It really does sound more like a hardware issue to me. What kind of power supply are you using?

 
Joined
Apr 3, 2008
Messages
43
I have a pretty decent PSU, by my estimation. It is a 600W OCZ StealthXstream.

Would adding more RAM give me more memory bandwidth--perhaps that's where the crash is originating?

Right now I have 2x 1GB Gskill F2 DDR2-800 sticks @ 1:1 = 866MHz
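
The 866MHz figure comes from the FSB and the memory ratio. A rough Python sketch of that calculation follows, assuming the ratio is written FSB:DRAM (so 1:1 means the memory clock equals the FSB clock) and that DDR2 transfers twice per memory clock. The function and parameter names are hypothetical, and BIOSes differ in how they label ratios.

```python
def ddr2_effective_mhz(fsb_mhz, fsb_part=1, dram_part=1):
    """Effective DDR2 data rate, as commonly quoted in MHz.

    Assumes an FSB:DRAM ratio of fsb_part:dram_part, so the memory
    clock is the FSB clock scaled by dram_part / fsb_part, and the
    data rate is double the memory clock.
    """
    memory_clock = fsb_mhz * dram_part / fsb_part
    return 2 * memory_clock


rated = 800                            # sticks are sold as DDR2-800
actual = ddr2_effective_mhz(433)       # 1:1 at a 433 MHz FSB
over_spec_pct = (actual - rated) / rated * 100
print(f"DDR2-{actual:.0f}, {over_spec_pct:.2f}% over the DDR2-{rated} rating")
```

So at 433 FSB and 1:1 the RAM is already running past its rated speed, which is worth keeping in mind before blaming bandwidth.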
 

Axdrenalin

[H]ard|DCer of the Month - Nov. 2009
Joined
Jan 28, 2004
Messages
6,226
Not necessarily, a couple gigs ought to be plenty. Do you use auto or manually adjust the RAM timings in your BIOS? Then again, that still doesn't explain why you can run the diskless client at higher clocks and not with an OS??? :confused:

Anyone else got any ideas on this? Possible HD errors, maybe? I'd drop the ratio on that RAM down to around the 800 or lower mark, maybe 2:1 ratio for starters, just to make sure the RAM is okay.

 
Joined
Apr 3, 2008
Messages
43
Also the board has a lot of copper, with two good-sized heatsinks, one over the voltage regulators and another on the NB, with three heatpipes between the two. I have fans drawing air off of each, 120mm and 80mm respectively.
 
Joined
Apr 3, 2008
Messages
43
I just formatted my HD and installed Xubuntu 7.10 yesterday night. (8.04 yields 100 PPD less, no clue why)

I'll try upping the RAM voltages (they're at a modest 2.1V), then run MemTest86 and see if it's stable at that speed.
 

Axdrenalin

[H]ard|DCer of the Month - Nov. 2009
Joined
Jan 28, 2004
Messages
6,226
I'm running 8.04 on my VMs and not getting any issues at all with lockups or freezes. Running a total of four VMs on two machines, and one standalone box. Wouldn't think it was the OS.

 

Sunin

[H]ard|DCer of the Month - August 2008
Joined
Dec 27, 2005
Messages
3,421
Well, your next step is to drop the multiplier on the processor to 6 or 7, jack the FSB up, and see if it is stable. If yes, then it is not your mobo.

If it's not stable, check out the FSB-related voltages (NB, FSB, etc.) and tap them up 0.1V.

If it was stable, then put back the multi, get CPUID and watch the volts to your CPU... up the voltage some until either your temps get too hot or, on a Quad, you hit 1.5V to the CPU... I never exceed that number. Also make sure you turn off things like EIST or Thermal Control in the BIOS. Those can introduce instability as well. Of course, use Orthos or Prime95 to stress test with each adjustment.
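
The steps above can be sketched as a small Python helper. This is only an illustration of the reasoning, not a tool: the function names are made up, and the 440 MHz FSB is just the figure mentioned earlier in the thread.

```python
def isolation_clocks_mhz(fsb_mhz, low_multipliers=(6, 7)):
    """Core clock (MHz) at each low multiplier with the FSB held high.

    With the multiplier dropped, the CPU runs well below its known-good
    speed, so any instability points at the board/FSB side instead.
    """
    return {m: fsb_mhz * m for m in low_multipliers}


def next_step(stable_at_low_multiplier):
    """Return the follow-up suggested for each outcome of the test."""
    if stable_at_low_multiplier:
        return ("Board handles the FSB: restore the multiplier and raise "
                "CPU voltage in small steps while watching temps.")
    return ("Board-side limit: raise the FSB-related voltages (NB, FSB) "
            "by 0.1V and retest.")


print(isolation_clocks_mhz(440))
print(next_step(stable_at_low_multiplier=True))
```

Either way, stress test after every single change so you know which adjustment made the difference.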

Good luck.

 