Sudden BSOD issues

Wheresatom

In the last couple of weeks my computer has BSODed 4-5 times, I believe. I had not changed any settings on the computer. My CPU HSF is clean, and I have been paying closer attention to the temperatures and am not seeing anything out of the ordinary. Max temperatures have been around 40-45 °C. After the first one, in an attempt to treat the symptoms rather than cure the illness, I reduced my OC from 3.2 to 3.0. It has BSODed a couple of times at that speed, and last night it even gave errors in the SMP client at about the 3rd step of whatever work unit it was.

Anyway, so basically, I think I have fried something. A few weeks ago when I went to put my USB drive into the computer, a spark went from the USB drive to the computer (or vice versa) and the computer turned off. No shutdown sequence, just off. This happened twice, I think, over the course of a couple of days. Since then I have been a bit better at topping off the humidifier and there is less static, but the damage is done. I assume these shocks have hurt some component in my computer.

The question to me is, which component? I am not very experienced with this type of thing so I don't know what breaks when this happens. Is this going to be a big trial and error situation where I just have to swap out parts till it works again or is it most likely one component?

I hope this isn't expensive.

 
Depending where the spark went to:

USB port:
If the spark went to the USB port, the likely casualty is the USB controller chip on the motherboard.
You can troubleshoot this by turning off USB support in the BIOS and seeing if you still get BSODs.
If you need USB for your keyboard, mouse, etc. during this period of testing, you can always buy a cheap USB PCI controller card and use that as a temporary controller while testing the onboard USB chip.
In my opinion it would take a pretty large electrical charge to take out the onboard USB chip, since they are built with pretty wide tolerances, but it is possible.

General computer case:
Swap out the memory and see if the BSODs go away.
 
Damn, I am not entirely sure where the shock went to. I have an Ubuntu live CD laying around somewhere. Will the memtest on that work for testing? If so, how do you use it? Is it like a stress test and you just run it until you feel confident or does it have a 15 minute test it runs or something?
 
I would use the memtest86+ CD. http://www.memtest.org/
It should show whether the memory has been damaged.
Boot from the CD and run a full test on the memory for at least an hour or so, so that the memory and the north bridge get up to temperature and the inside of the case reaches its highest temperature.
 
I run the stress tests in this order:
Memory -> Memtest86+. Run 2-3 loops error free.
CPU -> Orthos, Prime, etc. Run 2-24 hours error free.
Video -> FurMark, ATITool, etc. Run until the temp is stable.
Hard Drive -> Run the test from the hard drive maker's home page.

If everything tests error free and the temps/volts are in the correct range, then it's time to start thinking software error.
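For anyone who wants to keep track of where they are in that sequence, here is a rough Python sketch of the same order. The names `TEST_ORDER` and `next_step` are made up for illustration, and it only tracks pass/fail; checking temps and volts is still up to you.

```python
# Test order taken from the post above: memory, CPU, video, hard drive.
# TEST_ORDER and next_step are hypothetical names for this sketch.
TEST_ORDER = ["memory", "CPU", "video", "hard drive"]


def next_step(results):
    """Return what to do next.

    results maps a component name to True (test passed) or False (test failed).
    A component missing from results has not been tested yet.
    """
    for component in TEST_ORDER:
        passed = results.get(component)
        if passed is None:
            return f"run the {component} test next"
        if not passed:
            return f"suspect the {component}"
    # Every hardware test passed, so the remaining suspect is software.
    return "hardware tests clean, start thinking software error"


if __name__ == "__main__":
    # Example: memory passed, nothing else has been run yet.
    print(next_step({"memory": True}))
```

The point of going in a fixed order is that a failure early in the list (memory) can cause errors in the later tests too, so there is no sense trusting a CPU or video result until the earlier ones are clean.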

Also, what stop code is the BSOD generating?
To see the stop code, turn off auto-restart in System Properties -> Advanced -> Startup and Recovery.
If it's the same one every time, then look it up.

Luck .............. :D
 
I'll throw in there that when I started having the same problem, it was my power supply dying - so if all of Tiger's tests come back negative, that's another possibility. The straw that broke my PSU's back was the added stress of some of the new F@H WUs - the boxen was running fine for a few months, then all of a sudden, within a few days, I lose about 1500 ppd and start getting BSOD/shut downs for no apparent reason...
 
Hopefully you don't have the same bug my sister's laptop got during one of the Vista updates. Ever since the update her laptop's been BSODing left and right, though it's Vista 32-bit, not 64-bit. I've had the same thing happen with the spark on the USB, but all it did for me was kill the USB controller, so I couldn't use the USB ports on the back of the computer. I had to use the USB ports on the front that were connected to the USB headers.
 
Ok, so I ran memtest without any issues. I started my computer and it is running the SMP client right now using the unit it last downloaded. I stopped the GPU client and the other software I installed in the last few weeks that runs as a service (Playon by Media Mall). In Vista you can go back through and check for info about Blue Screen issues after it screws up. I had a nice mix of BCCode 116 & 117 in the last month or so. A little googling says that is a video hardware issue. I don't know what the cure for it would be though. Does anyone have any experience with this?

We will see in another 24 hours hopefully that my CPU is stable as long as GPU isn't running...
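For reference, the BCCode values mentioned above can be matched to Microsoft's bug check names with a small lookup like the Python sketch below. It assumes the BCCode shown in Vista's problem reports is hexadecimal, and the table is deliberately tiny; `BUG_CHECKS` and `describe_bccode` are hypothetical names for this example, not part of any Windows tool.

```python
# A few bug check codes and their documented names.
# BUG_CHECKS and describe_bccode are hypothetical names for this sketch.
BUG_CHECKS = {
    0x0A: "IRQL_NOT_LESS_OR_EQUAL",
    0x50: "PAGE_FAULT_IN_NONPAGED_AREA",
    0x116: "VIDEO_TDR_FAILURE",
    0x117: "VIDEO_TDR_TIMEOUT_DETECTED",
    0x124: "WHEA_UNCORRECTABLE_ERROR",
}


def describe_bccode(bccode):
    """Turn a BCCode string such as "116" into a stop code and name.

    Assumption: the string is hexadecimal, with or without a 0x prefix.
    """
    value = int(bccode, 16)
    name = BUG_CHECKS.get(value, "not in this table, look it up")
    return f"0x{value:08X} {name}"


if __name__ == "__main__":
    for code in ("116", "117"):
        print(describe_bccode(code))
```

Both 0x116 and 0x117 are TDR (Timeout Detection and Recovery) codes: the display driver stopped responding in time. That fits a video problem, but the code by itself does not say whether the driver, the card, or the power feeding it is at fault.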
 
Is your PCIe frequency overclocked in the BIOS? The default is 100 MHz.
 
Ok, so things were still fine this morning. I will let it complete this work unit before I rule out the CPU overclock. Everything I see points to Video problems though, which makes sense due to my problems with GPU2 folding.

I don't know what the next step is going to be though. I guess I can start a game of pin the driver to the system and just see if I can find a driver that doesn't hurt me. Or, is this a hardware problem? Is there a good way to stress test the power supply (one suggested problem)? Another solution would be to RMA my video card, but I don't want to do that if it isn't going to help. My searching for this BCcode 116/117 always just leads me to a crap shoot sorta. I have yet to find a definitive answer. The problems seemed to be more prevalent 2 years ago when Vista was newer though. I guess that points to drivers.

Sorry for rambling. I just don't know what my next course of action should be.
 
Ok, I hate to keep bumping this, but I really am interested in finding the problem. I go out of town on the weekends enough that I would like to not wonder if this is working.

Anyway, with CPU alone, the work unit finished completely and I started into further testing. For about 2 hours last night I ran Furmark and Orthos without any issues. The GPU temperature got as high as 80 and just kinda stayed at 79 for most of the time.

Do I need to run like that for 24 hours or so to consider myself stable? Or should I try to run Orthos with GPU2 FAH to see if I can make it crash again? Or should I just deal with the problem until I can re-create it more often?
 
Try running ATITool and 3DMark to stress the GPU. This will also be a good way to make sure the power supply is delivering the correct amount of power.
 