2685 crashed with weird error

amdgamer

Supreme [H]ardness
Joined
Oct 27, 2004
Messages
4,880
[10:36:07] Completed 210000 out of 250000 steps (84%)
[11:02:02] Completed 212500 out of 250000 steps (85%)
[11:27:55] Completed 215000 out of 250000 steps (86%)
[11:53:44] Completed 217500 out of 250000 steps (87%)
[12:19:38] Completed 220000 out of 250000 steps (88%)
[12:45:31] Completed 222500 out of 250000 steps (89%)
[13:11:23] Completed 225000 out of 250000 steps (90%)
[13:37:19] Completed 227500 out of 250000 steps (91%)
[14:03:14] Completed 230000 out of 250000 steps (92%)
[16:13:42] - Autosending finished units... [August 9 16:13:42 UTC]
[16:13:42] Trying to send all finished work units
[16:13:42] + No unsent completed units remaining.
[16:13:42] - Autosend completed
[17:46:05] CoreStatus = C0000029 (-1073741783)
[17:46:05] Client-core communications error: ERROR 0xc0000029
[17:46:05] Deleting current work unit & continuing...
[17:46:23] Trying to send all finished work units
[17:46:23] + No unsent completed units remaining.
[17:46:23] - Preparing to get new work unit...

Out of curiosity, has anyone seen this before? Do you guys think this is due to a bad work unit, or a bad overclock? I just walked past my Gulftown and noticed that Windows had an error message displayed on the screen that said

FAHCORE_A3.exe has stopped working

It apparently happened sometime last night, but Windows was still working fine; only the A3 core had stopped working. It just went ahead and downloaded the same work unit again (the exact same one).
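For anyone puzzling over the CoreStatus line in the log above: the two numbers are the same 32-bit value, printed once in hex and once as a signed integer. A minimal Python sketch of that conversion follows (the helper name `to_signed32` is made up for illustration and is not part of the FAH client):

```python
def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF                  # keep only the low 32 bits
    if value & 0x80000000:               # top bit set means negative
        return value - 0x100000000
    return value


core_status = 0xC0000029                 # value taken from the log line
print(hex(core_status), to_signed32(core_status))
```

So searching for either form of the code should turn up the same reports.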
 

APOLLO

[H]ard|DCer of the Month - March 2009
Joined
Sep 17, 2000
Messages
9,089
Sorry to see this. It happens to me on occasion when the temps or the OC is too high. There are other possible causes as well. I would shut down the client before closing the error message box. There's a better chance you could have saved the WU.
 

musky

[H]ard|DCer of the Year 2012
Joined
Dec 14, 2009
Messages
3,154
FYI, if you see the Windows message that FAH has stopped working, do this - click OK, then immediately select your cmd window and hit ctrl-C. If you can stop the client before it knows it crashed, you can save your work unit.

As to why it crashed, no idea. It is usually related to instability due to overclock. I am just guessing here, but I think the fact that the machine did not hang or reboot is a signal that something is wrong with your memory. I got these all the time trying to get some "less than stellar" memory to actually work.
 

amdgamer

Supreme [H]ardness
Joined
Oct 27, 2004
Messages
4,880
Well damnit............


My Gulftown just failed memtest. I had it set to run 3 passes and it failed on the 3rd pass.
 

10e

2[H]4U
Joined
Jul 20, 2006
Messages
3,383
Well damnit............


My Gulftown just failed memtest. I had it set to run 3 passes and it failed on the 3rd pass.

Try one of two things when you get a chance:

1) Lower the memory divider in the BIOS to a lower amount and re-test. If it works on the same test that you just failed then the memory is not the issue.
2) Return to the same memory speed that failed and raise the VTT in the BIOS one notch.

I finally got my I7 980x and Kill-A-Watt today so I'll see what works on my rig to see if I can help. Usually that specific error in BigAdv is related to memory being unstable or defective.

I believe we have the same mobo (GA-X58A-UD3R) but my chip is a 2010 Week 5 batch.
 

amdgamer

Supreme [H]ardness
Joined
Oct 27, 2004
Messages
4,880
Try one of two things when you get a chance:

1) Lower the memory divider in the BIOS to a lower amount and re-test. If it works on the same test that you just failed then the memory is not the issue.
2) Return to the same memory speed that failed and raise the VTT in the BIOS one notch.

I finally got my I7 980x and Kill-A-Watt today so I'll see what works on my rig to see if I can help. Usually that specific error in BigAdv is related to memory being unstable or defective.

I believe we have the same mobo (GA-X58A-UD3R) but my chip is a 2010 Week 5 batch.

My memory is currently running at stock speeds and timings. My RAM is rated for DDR3-1600 CL7, but I'm running it at the stock setting of 1333MHz at CL7. I have Vdimm set at 1.66 volts. Here is the message that memtest gave me, which is weird because I thought memtest was supposed to tell you which module failed.

The Windows Memory Diagnostic tested the computer's memory and detected hardware errors. To identify and repair these problems, contact the computer manufacturer
 

APOLLO

[H]ard|DCer of the Month - March 2009
Joined
Sep 17, 2000
Messages
9,089
Here is the message that memtest gave me, which is weird because I thought memtest was supposed to tell you which module failed.

The Windows Memory Diagnostic tested the computer's memory and detected hardware errors. To identify and repair these problems, contact the computer manufacturer
This is a new system. If you do have a defective module it should still be covered in the guarantee from MC.
 

amdgamer

Supreme [H]ardness
Joined
Oct 27, 2004
Messages
4,880
This is a new system. If you do have a defective module it should still be covered in the guarantee from MC.

Yeah, I'm taking it back to Micro Center right now. This is EXACTLY why I don't do my own system builds anymore. I never have time to fuss with this type of stuff, and with vague, generic error messages like this it can take a very long time to pin down what is wrong. I have been searching online, and it may be a hardware problem other than the memory. From what I'm reading, Windows Memory Diagnostic tells you exactly which module failed when the problem is with the RAM. I just got off the phone with them and they will warranty the build, including diagnostics, for 30 days.
 

Nathan_P

[H]ard DCOTM x3
Joined
Mar 2, 2010
Messages
3,448
I had the same problem with a normal A3 unit a couple of weeks ago. I was like you and contemplating new memory. However, I flushed the whole install and started again, and have had no problems since. The system was at stock memory timings before and is now (DDR2 5-5-5-18 2T 6400). I can only guess that something somewhere causes the memory to glitch and throw an error.

Oh when you get your machine back you may find it will try and download the same unit again - if it does you will need to change the machine ID on the client - as the server was never told it was bad it will reissue the same WU ad infinitum
 

amdgamer

Supreme [H]ardness
Joined
Oct 27, 2004
Messages
4,880
I had the same problem with a normal A3 unit a couple of weeks ago. I was like you and contemplating new memory. However, I flushed the whole install and started again, and have had no problems since. The system was at stock memory timings before and is now (DDR2 5-5-5-18 2T 6400). I can only guess that something somewhere causes the memory to glitch and throw an error.

Oh when you get your machine back you may find it will try and download the same unit again - if it does you will need to change the machine ID on the client - as the server was never told it was bad it will reissue the same WU ad infinitum

The problem is that when I ran the Windows 7 memory diagnostic, it detected a serious hardware problem and failed during pass #3. In the past when I've had WUs crash and flush, I've always run the mem tool afterwards and it would come out just fine.
 

amdgamer

Supreme [H]ardness
Joined
Oct 27, 2004
Messages
4,880
Well, folding2 is at Micro Center now. Just got back home, but somehow I have a feeling that they won't find a problem, or the diagnostic will take a very long time.
 

Nathan_P

[H]ard DCOTM x3
Joined
Mar 2, 2010
Messages
3,448
The problem is that when I ran the Windows 7 memory diagnostic, it detected a serious hardware problem and failed during pass #3. In the past when I've had WUs crash and flush, I've always run the mem tool afterwards and it would come out just fine.

Hmm - not good, but at least it's new and it should be a straightforward exchange. I'm using XP32 so I just flushed, reinstalled and fired everything up again. I'm too busy trying to get all my GPUs running to faff about with the RAM. If it throws up an issue again then I'll look at it, but for now it's back to normal 24/7 service :cool:
 

amdgamer

Supreme [H]ardness
Joined
Oct 27, 2004
Messages
4,880
I vote we have contest to give amdgamer's folding rigs better names than folding1 and folding2...:)

LOL......

I'm a simple person and didn't really think too much about the names. Because they all share most of the parts and especially the same case, they do look exactly the same as they sit next to each other on my desk.
 

tjmagneto

[H]ard DCOTM x2
Joined
Aug 6, 2008
Messages
3,281
Just like this?

matrix_agents.jpg
 

amdgamer

Supreme [H]ardness
Joined
Oct 27, 2004
Messages
4,880
Well damn.........they still don't know what is wrong, as they haven't been able to cause any more instabilities during diagnostic testing. If they can't find anything wrong, I'm going to just have them swap the RAM for the more conservatively timed Corsair DDR3-1600 modules. The price on these was too good to be true, as I have no idea how CL7 modules can be $20 cheaper than the CL8 modules.

BTW, what happens if you don't set the uncore multiplier at twice the memory ram multiplier? Could it be that I couldn't get the system working when I set the ram speed at 1600 without fiddling with the uncore multiplier?
 

musky

[H]ard|DCer of the Year 2012
Joined
Dec 14, 2009
Messages
3,154
Gigabyte board? That might be it. Zero will chime in at some point I am sure. He has a lot more experience with Gigabyte boards and overclocking than most.
 

10e

2[H]4U
Joined
Jul 20, 2006
Messages
3,383
The rule is that the uncore multiplier should be set to at least twice the RAM multiplier, so that may be a possibility. But you had the issue at 1333MHz as well, yes?

I know the Gigabyte X58A-UD3R has a slow mode of 100mhz for the uncore hard set (ie. independent of BCLK/Bus Speed) but I'm not sure how well that works, 'cause I've never tried it.

Setting it one notch higher posted fine, but failed stability tests for me, and setting it one notch below (if memory serves) caused a post failure.

I have the i7 980x now and will be installing it tonight, so I'll try those settings out, as that's the same board that I'm installing it in as well.

I DO know that I have the RAM at 1632 on this board 100% stable with a BCLK of 204mhz on my I7 920 D0.


Well damn.........they still don't know what is wrong, as they haven't been able to cause any more instabilities during diagnostic testing. If they can't find anything wrong, I'm going to just have them swap the RAM for the more conservatively timed Corsair DDR3-1600 modules. The price on these was too good to be true, as I have no idea how CL7 modules can be $20 cheaper than the CL8 modules.

BTW, what happens if you don't set the uncore multiplier at twice the memory ram multiplier? Could it be that I couldn't get the system working when I set the ram speed at 1600 without fiddling with the uncore multiplier?
 

amdgamer

Supreme [H]ardness
Joined
Oct 27, 2004
Messages
4,880
Gigabyte board? That might be it. Zero will chime in at some point I am sure. He has a lot more experience with Gigabyte boards and overclocking than most.

I dunno man, I've used Gigabyte boards ever since an Asus I had a while back shorted out a RAM slot. These have been pretty solid and folding1 is holding up very well. Since I am still within the 30 day return period, I am tempted to return the Gigabyte and switch it for an eVGA board, as those have typically been rock solid.

Uncore only needs to be 1.5x or higher on Gulftown.

Hmmm, if that is the case then this isn't the problem. When I switched the multiplier to try to get the system running with the ram running at exactly 1600mhz, the uncore would have been at least 1.5x the ram multiplier anyway.
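To put numbers on the two uncore rules quoted in this thread (2x the memory multiplier, or 1.5x on Gulftown), here is a rough Python sketch. It assumes the memory data rate is simply BCLK times the memory multiplier, which is how 10e's 204MHz BCLK and 1632 RAM figure line up with a multiplier of 8; the function names are made up for illustration and do not correspond to any BIOS setting.

```python
import math


def ddr3_data_rate(bclk_mhz: float, mem_multiplier: int) -> float:
    """Assumed relation: effective DDR3 rate = BCLK x memory multiplier."""
    return bclk_mhz * mem_multiplier


def min_uncore_multiplier(mem_multiplier: int, ratio: float) -> int:
    """Smallest whole uncore multiplier that satisfies the given ratio rule."""
    return math.ceil(mem_multiplier * ratio)


# Example: a 12x memory multiplier on a ~133MHz BCLK gives roughly DDR3-1600.
mem_mult = 12
print(round(ddr3_data_rate(133.33, mem_mult)))        # approximate data rate
print(min_uncore_multiplier(mem_mult, 2.0))           # 2x rule
print(min_uncore_multiplier(mem_mult, 1.5))           # 1.5x rule (Gulftown)
```

Under those assumptions, a 12x memory multiplier needs an uncore multiplier of at least 24 with the 2x rule, or 18 with the 1.5x rule.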

Micro Center still has been unable to find any problems. I am going through withdrawals after buying a Gulftown and being unable to fold on it :(
 

amdgamer

Supreme [H]ardness
Joined
Oct 27, 2004
Messages
4,880
Okay guys, I want to know what you guys think..........

I'm considering having a new motherboard and new memory modules put in. This way, I won't ever need to second guess anything in case Micro Center is unable to find any problems.

What do you guys think about this motherboard?

http://www.evga.com/products/moreInfo.asp?pn=132-BL-E758-A1&family=Motherboard Family&series=Intel X58 Series Family&sw=5

and this Corsair memory that has more conservative timing (CL8 instead of CL7)

http://www.microcenter.com/single_product_results.phtml?product_id=0311100
 

SazanEyes

[H]ard|DCer of the Month - January 2011
Joined
Jan 14, 2009
Messages
1,250
Both of my i7s have CL9, including one with Corsair XMS3. From what I've heard, bandwidth is more important than CAS latency when overclocking. However, I don't know why higher-rated RAM couldn't run at a lower timing (set manually).
 

amdgamer

Supreme [H]ardness
Joined
Oct 27, 2004
Messages
4,880
Both of my i7s have CL9, including one with Corsair XMS3. From what I've heard, bandwidth is more important than CAS latency when overclocking. However, I don't know why higher-rated RAM couldn't run at a lower timing (set manually).

In general, I tend to be very conservative especially with ram timing. The Corsair XMS CL8 DDR-1600 ram is what is in folding1 and my gaming rig, so I at least know that ram will be good for some good overclocking.

I suppose you could run more aggressive timing at a lower speed, but it is one of those things where I'd rather not risk it.

On a side note, Micro Center still doesn't know what is wrong yet. Getting a tad bit frustrated considering that the system was only 1 week old. I talked with them last night and they were able to get the same memtest error I believe, but they have no idea what is causing it. I have never seen that generic of an error message in my life before.

"Your system may have a serious hardware problem, please contact manufacturer"

No shit Microsoft.............
 

musky

[H]ard|DCer of the Year 2012
Joined
Dec 14, 2009
Messages
3,154
Running CL8 or CL7 memory at CL9 is actually more conservative than running it at stock. I tend to set my systems up at 9-9-9-24-2T initially regardless of memory to find out how high it will go, then tighten up the timing after I have a final stable overclock. The difference between memory timing 8-8-8-20 and 9-9-9-24 is far less than the difference between PC1333 and PC1600.
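A back-of-the-envelope way to see this, using the standard DDR arithmetic (CAS latency in nanoseconds is the CL cycle count divided by the memory clock, which is half the data rate, and peak bandwidth per 64-bit channel is the data rate times 8 bytes). These are theoretical peak numbers only, not measured folding performance, and the function names are just for illustration:

```python
def cas_latency_ns(data_rate_mts: float, cl: int) -> float:
    """First-word CAS latency in ns: CL cycles at a clock of data_rate / 2 MHz."""
    return cl * 2000.0 / data_rate_mts


def peak_bandwidth_gbs(data_rate_mts: float) -> float:
    """Theoretical peak bandwidth of one 64-bit (8-byte) channel in GB/s."""
    return data_rate_mts * 8 / 1000.0


for rate, cl in [(1333, 7), (1600, 8), (1600, 9)]:
    print(f"DDR3-{rate} CL{cl}: "
          f"{cas_latency_ns(rate, cl):.2f} ns CAS, "
          f"{peak_bandwidth_gbs(rate):.1f} GB/s per channel")
```

Going from CL8 to CL9 at DDR3-1600 costs a bit over a nanosecond of CAS latency, while going from 1333 to 1600 adds about 20% theoretical bandwidth per channel.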
 

APOLLO

[H]ard|DCer of the Month - March 2009
Joined
Sep 17, 2000
Messages
9,089
On a side note, Micro Center still doesn't know what is wrong yet. Getting a tad bit frustrated considering that the system was only 1 week old. I talked with them last night and they were able to get the same memtest error I believe, but they have no idea what is causing it. I have never seen that generic of an error message in my life before...
I would just tell them to install different modules. I'm sure that will take care of the errors. It could be one or more modules has a problem and this will only worsen with time. If it's not the memory, maybe something is defective on the board. At least you'll know by changing the modules.
 

amdgamer

Supreme [H]ardness
Joined
Oct 27, 2004
Messages
4,880
I got a call from them earlier today while I was at work, around 8:30pm. They found the problem, and it was rather serious. They said that one of the RAM slots on the motherboard shorted out during diagnostic testing and ended up taking out all 3 modules with it. They replaced the motherboard and RAM and said I can pick it up in the morning. Once they did this, they said that they had no problems getting my RAM running at 1600MHz.

My biggest concern is that I bought CL7 ram, and Micro Center was out of that type of ram. I hope they didn't decide to just replace it with the CL8 Corsair which they have plenty of in stock.
 

amdgamer

Supreme [H]ardness
Joined
Oct 27, 2004
Messages
4,880
Yikes! Glad they got it fixed, though.

Yeah, and this is exactly why I don't do my own system builds anymore. I have so little time right now, and it would have been a catastrophe if I had to go and start trying to troubleshoot what is wrong.

I figured there was something wrong when I couldn't get the system stable at 1600mhz no matter what when I first got the rig. I feel a little better now that I know it was a bad motherboard. It is scary that a ram slot shorting out can take out all 3 modules though.
 

APOLLO

[H]ard|DCer of the Month - March 2009
Joined
Sep 17, 2000
Messages
9,089
A defective RAM slot featured strongly on my mind. It's a good thing they were able to find the problem and a good thing you sent it back quickly.
 

amdgamer

Supreme [H]ardness
Joined
Oct 27, 2004
Messages
4,880
A defective RAM slot featured strongly on my mind. It's a good thing they were able to find the problem and a good thing you sent it back quickly.

Before the gaming rig you see in my sig, I was still happily gaming on my previous rig, which featured an Asus A8N-SLI AGP 8x mobo, 4 gigs of DDR-400 RAM (4x1), an eVGA 7800GTX, an AMD Athlon 64 X2 4400+, and a 74gig WD Raptor HD. What did my previous gaming rig in was one of the RAM slots shorting out and taking out 2 of my 4 memory modules.

It is cruel how I am getting taunted yet again by folding2 doing this to me. Just seems so much like deja vu, happening over again although at least folding2 was still under the 30 day warranty.
 

amdgamer

Supreme [H]ardness
Joined
Oct 27, 2004
Messages
4,880
Well, folding2 is back online. However, I'm gonna have to fiddle with it when I have some time, as it is getting terrible TPFs. On a 2682, I'm getting a TPF of nearly 30 minutes with the Gulftown running at 3.6GHz and the RAM running at DDR3-1600 with CL8 timing.
 

APOLLO

[H]ard|DCer of the Month - March 2009
Joined
Sep 17, 2000
Messages
9,089
A quick remedy for the low TPFs would be upping the clock frequency, but I guess you want to get the memory settings down pat first. What did you get done in the end, BTW? New board+memory or just new board?
 

amdgamer

Supreme [H]ardness
Joined
Oct 27, 2004
Messages
4,880
A quick remedy for the low TPFs would be upping the clock frequency, but I guess you want to get the memory settings down pat first. What did you get done in the end, BTW? New board+memory or just new board?

New board plus new memory. They said that ram slot shorted out during diagnostic testing and took out all 3 modules. The good news is that I can finally run the ram at 1600mhz. Before I took it in, I couldn't get the system stable past 1333mhz.

These TPF's are really really bad. Maybe there are some bios settings that got messed up. I manually set the ram at 8-8-8-24 timing, but didn't bother looking at the other stuff under dram in the bios. At 3.6ghz and the memory running at 1333@CL7 previously, I was getting TPF's of 26min on 2685's.

Folding1 is currently getting TPFs of 36min on 2682s, and it is a Bloomfield running at 3.46GHz with RAM running at 1280MHz or something like that. I'm hoping to have the funds to put a Gulftown in this within 3 months or less.
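For reference, TPF (time per frame, i.e. per 1% of the work unit) can be read straight off the client log by differencing the timestamps on consecutive "Completed ... steps" lines. A rough Python sketch, using the first three progress lines from the log at the top of the thread; the regex and the `frame_times` helper are my own and not part of any FAH tool:

```python
import re
from datetime import datetime, timedelta

# First three progress lines from the log posted at the top of the thread.
LOG = """\
[10:36:07] Completed 210000 out of 250000 steps (84%)
[11:02:02] Completed 212500 out of 250000 steps (85%)
[11:27:55] Completed 215000 out of 250000 steps (86%)
"""

PATTERN = re.compile(
    r"\[(\d\d:\d\d:\d\d)\] Completed \d+ out of \d+ steps\s+\((\d+)%\)"
)


def frame_times(log: str) -> list:
    """Return the time taken per 1% frame between consecutive progress lines."""
    points = [
        (int(m.group(2)), datetime.strptime(m.group(1), "%H:%M:%S"))
        for m in PATTERN.finditer(log)
    ]
    deltas = []
    for (p0, t0), (p1, t1) in zip(points, points[1:]):
        if p1 <= p0:
            continue                      # skip restarts or repeated lines
        delta = t1 - t0
        if delta < timedelta(0):          # timestamps wrapped past midnight
            delta += timedelta(days=1)
        deltas.append(delta / (p1 - p0))
    return deltas


tpfs = frame_times(LOG)
average = sum(tpfs, timedelta(0)) / len(tpfs)
print([str(t) for t in tpfs], "average:", average)
```

On those lines it comes out just under 26 minutes per frame, which matches the 26min figure for 2685s mentioned above.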
 

APOLLO

[H]ard|DCer of the Month - March 2009
Joined
Sep 17, 2000
Messages
9,089
Sorry for the misunderstanding, I meant to say new as in different brand/model for each of your damaged components...

Did you check in the BIOS for CPU specific settings? I never touched a Nehalem board and don't really know what options there are for that architecture. With my older Xeon server boards there are quite a few settings that you can enable/disable which may or may not affect folding performance. I have no idea what's in your BIOS but you might want to sift through every area. Hope you can improve the performance.
 

amdgamer

Supreme [H]ardness
Joined
Oct 27, 2004
Messages
4,880
Sorry for the misunderstanding, I meant to say new as in different brand/model for each of your damaged components...

Did you check in the BIOS for CPU specific settings? I never touched a Nehalem board and don't really know what options there are for that architecture. With my older Xeon server boards there are quite a few settings that you can enable/disable which may or may not affect folding performance. I have no idea what's in your BIOS but you might want to sift through every area. Hope you can improve the performance.

Ahh, yeah I misunderstood you. Been at work all day, so a bit tired. I picked up my rig from Micro Center early today, and just now got around to getting it online.

Yeah, they replaced it with the same exact motherboard and supposedly the same ram. I really like Gigabyte and this is the first time I ever encountered a problem. It is the same exact motherboard that is running in folding1 as well, so it kind of makes it easy.

Right now, I'm so insanely busy I can't mess with much. The new business I got a job with is opening up another retail store, so all of us are currently working hard to make sure this goes smoothly. The TPFs are really, really bad, but I have no choice but to let it run as is until I can spend some time troubleshooting. I am willing to bet that it is just a simple BIOS option that I overlooked in my haste to bring folding2 back online.
 