cooling GPU VRMs

THRESHIN

Card in question is an Asus GTX 970 Strix. The stock fan cooler is removed, and I'm cooling the GPU with an old Swiftech MCW60. It's a dinosaur of a water block, but it does the job. Highest temp I saw overclocked was 50C.

However, this card doesn't overclock worth beans. Memory is nothing, and if I want stability it means less than a 100 MHz boost on the core with a voltage increase. Thing is, I don't get artifacting... it just shits the bed and gives me a blank screen.

So it dawned on me the other day. The VRMs only have the small heatsink on them that came stock and, without the stock cooler, NO fan.

Think this is my issue? Any suggestions for cooling them if it could be?
 
Is this the card in question?
https://www.techpowerup.com/gpu-specs/asus-strix-gtx-970-directcu-ii-oc.b3049
Some VRM designs have thermal protection built into them, where the VRM will totally shut down if it gets too hot. Usually, it's the ones that use power stages that have this, as opposed to discrete FETs like that card, but it's not unheard of for some cards to have a discrete sensor on the board nearby. The Radeon 290X is like this, for instance.

I suggest you aim a fan at the VRM and see if that makes a difference. One thing to note is that on some designs, the FET gates are driven using pretty much straight 12V from the power supply, and if your power supply doesn't supply twelve actual volts, the FETs will run very hot. If the voltage drops low enough, down into the mid 11s in my experience, it can even become effectively uncoolable, sometimes leading to damage. Thus, I would also suggest that you check the supply voltage to the PCI-E connector on the card while it's running a game and see what kind of voltage you're actually getting.
 
I have a few pics I can upload once I get a chance. Spoiler: I used Thermal Grizzly and copper shims on every VRAM chip, plus the arctic extreme 5 aftermarket cooler. It was hilariously chill. If the sensors are to be believed, the core was 45C under load, and it was one I had to do the "bake" on to fix artifacting.
 
Last night I did what I should have done in the first place: dug out my IR temperature gun. Played Control for a bit to heat it up and got 95C on the base of the VRM heatsink... I think I found the problem. Now to figure out a good way to cool these things. Might try those shims and stick a fan on it or something.
 
You said you're water cooling the die - maybe look on ebay for a used full coverage water jacket?

Also, if you have a multimeter, check the actual supply voltage to the card. Hot-running FETs can be a symptom of a weak power supply.
 
Where would I check the voltage? I have a couple of multimeters actually. I do a lot of car repair and, well, repair anything actually. I'm just not sure where to measure.

The issue I constantly run into is that most of my knowledge is from years ago... like Voodoo3 days. Even up to the first dual cores. After that I never kept up on the details.

I think one thing I'll do today is try to improve the VRM cooling. The heatsink is getting removed, the junk thermal pad cleaned off, and then it goes right onto the VRMs with some MX-4. I'll probably have to remove the junk plastic anchors and replace them with some screws. I'm also thinking of adapting a 120mm fan to blow right on it.
 
The easiest place to measure is the solder joint where the power connectors attach to the card. In each connector, there are three pins, in the row closest to the latch, that should have 12.0 volts. On power supplies with the old skool "ketchup and mustard" cables, they're the pins with yellow wires attached to them.

What you're looking for is 12.0V at idle and maybe like 11.8 under load. If you see less than that, then the power supply is suspect, especially with a card as efficient as a 970. Thirstier cards will show a greater drop under load.

Chances are, your problem is just that you don't have enough cooling on the FETs, but I once struggled for days with a situation where my high side FET kept burning out on my GTX 690s, and it was only after it killed two or three of them that I figured out what was going on.
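
For anyone who wants to jot down meter readings and sanity-check them, here's a rough sketch of the rule of thumb above in Python. The function name, the labels, and the exact 11.5 V cutoff I picked for "mid 11s" are all my own placeholders, not anything official, so treat it as a way to organize readings rather than a diagnosis.

```python
# Rule-of-thumb check for 12 V readings taken at the card's power connector.
# Thresholds follow the numbers quoted in this thread; DANGER_V is an
# assumed stand-in for "mid 11s" and is not a spec value.
LOAD_FLOOR_V = 11.8   # roughly what you'd hope to see under load
DANGER_V = 11.5       # assumption: where FET heating may become hard to manage


def assess_12v_rail(idle_v, load_v):
    """Return (droop_in_volts, verdict) for one idle/load pair of readings."""
    droop = round(idle_v - load_v, 3)
    if load_v < DANGER_V:
        verdict = "danger"
    elif load_v < LOAD_FLOOR_V:
        verdict = "suspect"
    else:
        verdict = "ok"
    return droop, verdict


if __name__ == "__main__":
    # Hypothetical readings, just to show the call.
    print(assess_12v_rail(12.0, 11.85))
```

The idle reading is only used to show how far the rail sags. The verdict comes from the under-load number, since that's the one that matters for the FETs.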
 
And yeah, sorry for the delayed response, that is the exact card.
 
You've got the right idea. The easiest quick fix is to zip tie a 120mm fan over the back of the card. You can get some copper or aluminum heatsinks for the FETs, but chances are the fan alone will keep them plenty cool.
I used an MCW60 on a 580 and a 980 Ti with a 120mm fan. You can make it look pretty good if you mount the 120 carefully. I was able to run the same max OCs with this as I did with EK full-coverage blocks.
 
So I just finished rigging up an old 120mm fan salvaged from an old system in the closet. Can't easily measure voltage since the solder contacts are behind a shroud. I'll see what happens with the extra cooling before I tackle that one.

First thing I noted was that the screws on the heatsink were very loose. I don't have shims on hand, so I had to stick with the ghetto thermal pad. There are some resistors under there that sit slightly higher than the chips.

Played a bit of Control again, and remember how I measured 95C last night? Well, with the fan it's 42C measured at the same point. Not bad!

I'll give overclocking a try again and see what happens. Thanks for the help guys. Great as always.
 
Thought I'd report back with some preliminary results. Last night I cranked the 970 up to 1500 MHz and boosted the voltage to about 1.25 V. Previously, there's no way it would have been stable at these settings. I'd be lucky to get 10 minutes before the video card shut down.

Last night I played Control for an hour and a half. It did finally crash, but only the game crashed to the desktop. No artifacts. I can't necessarily attribute that to the overclock. Could have been a game bug for all I know.

So I'd call this very conclusive: my issue was the VRMs overheating and then shutting down. The GPU core is actually running 5C cooler under load now, so there's plenty of headroom. Not sure if that is because of adding the fan or because it's getting proper power flow now. I suspect power, since the fan isn't going to do much blowing on a water block.

Guess it's time to find my limit. Maybe boost the voltage a little more and see where that takes me. With the GPU loading at around 45C, I think it's got plenty of room.

Thanks for the help guys. Although I strongly suspected this, you guys were a big help along the way.

This is also going to change how I cool my cards in the future. The MCW60 is a GPU only cooling solution since old cards didn't require more than that. Obviously that is no longer the case. Funny thing is that since the G80, Nvidia hasn't changed the mount holes so I can still use this thing :p
 