670 GTX Power issue during Litecoin mining

Zyklon808 · Jan 17, 2014

I have two MSI GTX 670 Power Edition OC cards powered by a Silverstone 1000W PSU. The cards are liquid cooled and never go above 45-50°C.
I started to mine Litecoins about 2 weeks ago.
Now if I mine for longer than 3 to 4 hours, the computer just reboots. It looks like a power-off, but then it fires back up. When I rerun my mining app, cudaminer, it only sees one video card. Also, the Nvidia Control Panel is gone.
I just reinstall the driver on top of itself and it goes back to normal.
This does NOT happen when I play BF4 or other games for 3-4 hours. Just mining.
The cards have an EK universal block on the GPU and the original heatplate covering the RAM and some of the VRMs. I have no way of telling what temps the VRMs are at, other than knowing that there is direct air on them comparable to the Twin Frozr IV cooler they had before.

Now here is the question: is the second GPU going bad, or is my PSU going? I've tested the PSU and it's all clear. Also, I've had more than one SLI setup on this PSU.
How would I test just card #2 without tearing the whole loop apart?

cortexodus · Jan 17, 2014

Just a guess, but maybe you're running into issues with voltage through the mainboard. What's your mobo?

Zyklon808 · Jan 17, 2014

Update: I JUST saw it happen in front of me. Every other time it did it while I was away from home.
I got ye olde IRQL_NOT_LESS_OR_EQUAL on the bottom of my pretty Windows 8 blue screen.
I'm using a Maximus V Gene with a 3570K, no overclocking.

af22 · Jan 17, 2014

Litecoin mining is heavily dependent on the memory of the GPU. What are the temps of your memory and VRM? Make sure you have airflow over the cards.

Also, what's the purpose of mining with a GTX 670? You're looking at nearly no profit, especially after the fees of cashing out.

Zyklon808 · Jan 18, 2014

Free electricity. With all my cards (2 in one PC and 1 in another) I get around 1 MH/s.

I do not know the temps, as they are not shown in GPU-Z or MSI Afterburner. Just GPU temps.
I do not have a fancy infrared laser-guided thermometer to just point at components and check.
Here is a picture of what it looks like without the Twin Frozr:

[Image: MSI N670GTX-PE2GD5OC GeForce GTX 670 2GB GDDR5]

Those components out in the open are cooled with equal or more air.

BallaTheFeared · Jan 18, 2014

Zyklon808 said:
Update: I JUST saw it happen in front of me. Every other time it did it while I was away from home.
I got ye olde IRQL_NOT_LESS_OR_EQUAL on the bottom of my pretty Windows 8 blue screen.
I'm using a Maximus V Gene with a 3570K, no overclocking.

Pretty sure that is a CPU error. Download this and get the error code.

http://www.nirsoft.net/utils/blue_screen_view.html

Ultima99 · Jan 18, 2014

Furmark

Dreamerbydesign · Jan 18, 2014

Zyklon808 said:
Update: I JUST saw it happen in front of me. Every other time it did it while I was away from home.
I got ye olde IRQL_NOT_LESS_OR_EQUAL on the bottom of my pretty Windows 8 blue screen.
I'm using a Maximus V Gene with a 3570K, no overclocking.

This is generally tied to an issue with the RAM, not the GPU.

Zyklon808 · Jan 18, 2014

Ok, I downloaded the app. Here is a screen shot:

HOODedDutchman · Jan 18, 2014

It's friggin Windows 8. Blue screens tell you bullshit nothing. I'm thinking about a dual boot with 7 just for overclocking. I know most traditional blue screen codes by heart, which makes overclocking so much easier. I find Win8 always says the same thing: I've had the same code while clocking CPU and RAM. While clocking the CPU, it gave me the same code while messing with vcore, mem voltage, and VTT. Garbage. Leave it to Microsoft to change something for no reason. It would be much easier to pinpoint with a traditional blue screen.

My rant of the day. Wish I could help.

edit: Seriously, there's an app that gives traditional codes. I'm a chump. Searched high and low a few months ago. Thanks!

Zyklon808 · Jan 18, 2014

Well the blue screen is for nubs anyways. I've used Event Viewer for the last decade or so. This crashview program is great! Windows 7 is just as esoteric, just easier to see.

I'm updating the LAN drivers because of the ndis.sys error.

Also, as stated in the OP, I am not overclocking.

HOODedDutchman · Jan 18, 2014

What CPU are you running? A 0x0A BSOD is either QPI/VTT voltage, RAM/IMC, or could even be vcore. Very unlikely it points to the video cards, though. If you're not overclocking, it sounds like RAM to me. It could be that your CPU has a weak memory controller and needs a bump to QPI/VTT for stability. RAM is finicky like this. A problem could pop up even when it's barely being used. I need to know the platform, CPU, and RAM to help any further, though. I've had a few CPUs that needed a bump from 1.05v to 1.1-1.15v for higher-speed RAM, even at stock clocks.
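
For anyone matching the name Windows 8 prints on its blue screen to the traditional stop code discussed above, here is a minimal lookup sketch in Python. The table holds only a handful of well-known bug check names; the function name `stop_code` is made up for this example, and the likely causes mentioned in the thread are not encoded in it.

```python
# A few well-known Windows bug check names and their stop codes.
# This is a partial table for illustration, not a complete reference.
BUG_CHECKS = {
    "IRQL_NOT_LESS_OR_EQUAL": 0x0A,
    "MEMORY_MANAGEMENT": 0x1A,
    "PAGE_FAULT_IN_NONPAGED_AREA": 0x50,
    "DRIVER_IRQL_NOT_LESS_OR_EQUAL": 0xD1,
    "CLOCK_WATCHDOG_TIMEOUT": 0x101,
    "VIDEO_TDR_FAILURE": 0x116,
    "WHEA_UNCORRECTABLE_ERROR": 0x124,
}


def stop_code(name):
    """Return the stop code as an 8-digit hex string, or None if unknown."""
    code = BUG_CHECKS.get(name.strip().upper())
    return None if code is None else f"0x{code:08X}"


if __name__ == "__main__":
    print(stop_code("IRQL_NOT_LESS_OR_EQUAL"))
```

A tool such as BlueScreenView reads the real code and the faulting module from the minidump, so a table like this is only a quick cross-reference.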

Zyklon808 · Jan 18, 2014

I'm using a Maximus V Gene with a 3570K, no overclocking.
Stock settings.

HOODedDutchman · Jan 18, 2014

Zyklon808 said:
I'm using a Maximus V Gene with a 3570K, no overclocking.
Stock settings.

How many sticks of RAM, and what speed and timings?

edit: Alright, I'm going to bed. Go into the BIOS and bump VCCIO to 1.15v; that should solve your issue. 1.1v should be fine, but if it's not, I don't want you to decide that's not the problem. Try 1.15v; if it solves your problem, then try 1.1v and see if it still works. 1.05v is stock, I believe. Don't exceed 1.2v, as it will damage the CPU. Check and make sure the memory is set to the proper voltage as well while you're in there.
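
The step-down procedure in this post can be written out as a small Python sketch. It assumes a hypothetical `is_stable(voltage)` callback that stands in for a manual test (setting the value in the BIOS and mining for several hours); nothing here touches hardware, and the 1.20v ceiling is simply the limit given in the post.

```python
def lowest_stable_vccio(steps, is_stable, ceiling=1.20):
    """Walk `steps` (voltages, highest first) and return the lowest stable one.

    Returns None if even the first, highest step is unstable, which suggests
    VCCIO is not the cause. `is_stable` is a hypothetical callback standing
    in for a manual multi-hour stability test at that voltage.
    """
    if any(v > ceiling for v in steps):
        raise ValueError(f"refusing to test above {ceiling:.2f}v")
    best = None
    for v in steps:
        if not is_stable(v):
            break  # stop at the first unstable step; keep the previous one
        best = v
    return best


if __name__ == "__main__":
    # Example: pretend the system is stable at 1.10v and above.
    print(lowest_stable_vccio([1.15, 1.10], lambda v: v >= 1.10))
```

Starting at the higher value first means a failure there rules VCCIO out quickly, rather than leaving the question open after a marginal 1.1v run.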

Zyklon808 · Jan 18, 2014

OK, I've done that and put the RAM down to 1333 from 1600.
I've got sticks of this RAM in slots 1 and 3.

Zyklon808 · Jan 18, 2014

I was mining when I found that one of the cards had just stopped working. The second card does not do any mining, and SLI does not re-enable. I have to reinstall the driver once again.

HOODedDutchman · Jan 18, 2014

Very weird. I suppose there's no way you can switch the cards. That BSOD code should have nothing to do with the GPU. Try testing each stick of RAM separately. It could also just be the program/Windows. If you have no other issues whatsoever, it may just be that. You could run a stretch of Heaven (a couple of hours) with SLI enabled and see what happens, then run a couple of hours of memtest and see if errors pop up.

xorbe · Jan 18, 2014

What PSU is powering all this hardware?

cortexodus · Jan 18, 2014

xorbe said:
What PSU is powering all this hardware?

OP said:

Zyklon808 said:
I have two MSI GTX 670 Power Edition OC cards powered by a Silverstone 1000W PSU.

xorbe · Jan 18, 2014

I wonder if the VRMs on the gfx cards are overheating due to gpu watercooling?

Zyklon808 · Jan 18, 2014

I would not think so as the vrms were always open to the air. Also there is air on it.

Reformatted last night, told it to mine and it reset again.

Will memtest expose flaws without having to run for 4 hours?

cortexodus · Jan 18, 2014

You could try running the miner for a while on each card individually and see if the issue is specific to one of the cards.
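
One way to script the per-card test is sketched below in Python. It assumes the miner accepts a `-d` switch to select a single CUDA device, as cudaminer's README describes; check `--help` for your build before relying on it. The helper names `per_card_commands` and `survives`, and any extra arguments you pass, are placeholders for this example.

```python
import subprocess


def per_card_commands(miner, extra_args, device_ids):
    """Build one command line per GPU so each card can be tested alone.

    Assumes the miner takes `-d <id>` to pick a device (an assumption
    based on cudaminer's documented options).
    """
    return [[miner, "-d", str(dev), *extra_args] for dev in device_ids]


def survives(cmd, seconds):
    """Return True if the process was still running after `seconds`."""
    try:
        subprocess.run(cmd, timeout=seconds)
    except subprocess.TimeoutExpired:
        return True   # still mining when the timer ran out
    return False      # exited early: crash, bad arguments, or a lost card


if __name__ == "__main__":
    # Print the commands only; run them through survives() one at a time.
    for cmd in per_card_commands("cudaminer", [], [0, 1]):
        print(" ".join(cmd))
```

If the whole machine reboots mid-run, the script never gets to report anything, so note which card was under test before starting each run. A card that fails alone at the same 3 to 4 hour mark points at that card rather than at the PSU or RAM.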

Zyklon808 · Jan 18, 2014

I'm using cudaminer with 2 threads, one for each card. Instead of using my normal switches, I set cudaminer to do auto. So far no crash.
I did 30 minutes of Furmark and no crash.

xorbe · Jan 19, 2014

Zyklon808 said:
Will memtest expose flaws without having to run for 4 hours?

Memtest is single-threaded weak sauce. Try Prime95 on blend overnight to test the CPU and memory.
