First, do not "reflow" this card, assuming that by "reflow" you mean put it in a domestic oven. That's a good way to turn a working card into a dead one. The worst case scenario here is that you graft some case fans onto this thing.
What I would do is get out an ohm meter and start probing...
The [H]ard way, obviously.
The cleanest way to deal with this would be to desolder the connector. It'd be pretty easy to do with a hot air station, but you could probably use a regular soldering iron and coax it off of there. Once you have the mangled connector off, for bonus points, you...
This sounds to me like a BGA failure, or damage to the board.
It's also possible that it's something like a cracked solder joint on some other component, but if it's intermittent, or it works if you press on certain components, it's probably something mechanical that's failed.
Are any of the fan blades dirty? It's probably a combination of worn bearings and an imbalance in the fan blades. You can try cleaning them, but the real fix is probably to replace the fan harness with a new one.
What was the result of removing the low side FET? Did that clear your short to ground?
Removing the coils might help. That effectively breaks the circuit in half - you've got the VRM half on one side, and the logic side on the other. You can then check each side for a short separately, and...
I'd imagine resistance on the coils should be around 30 ohms, but I can't remember ever having measured a 500 series card. I've got one at home I can check later, if you need me to.
What test did you do? Just measuring the resistance to ground on the coils?
Software tests, such as TSserver, are more relevant for a card that sorta-kinda works.
Yes. BGA = ball grid array. This is the type of solder technology that's used to attach the GPU and memory chips, with a grid of tiny solder balls on the underside. It's pretty common in older electronics for the solder joints to crack, or pull the pads off of the board, and the behavior...
Probably a BGA failure of some kind, but another possibility is something like a faulty bootstrap capacitor which is causing one of the phases to run super hot.
You'd check for that with a thermal camera, freeze spray, etc. If you've got one phase that's way hotter than the others...
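If you'd rather log readings than eyeball them, here's a rough sketch of the "one phase way hotter than the others" comparison. The phase names, the temperatures, and the 15 °C margin are all made up for illustration; pick a margin that makes sense for your board.

```python
from statistics import median

def hot_phases(temps_c, margin_c=15.0):
    """Return the phases running more than margin_c above the median phase.

    temps_c maps a phase label to a temperature in deg C. The default
    margin is an arbitrary assumption, not a spec value.
    """
    baseline = median(temps_c.values())
    return [name for name, t in temps_c.items() if t - baseline > margin_c]

# Hypothetical thermal camera readings, one per VRM phase.
readings = {"phase1": 48.0, "phase2": 50.0, "phase3": 83.0, "phase4": 49.0}
print(hot_phases(readings))
```

Comparing against the median rather than the average keeps one runaway phase from dragging the baseline up with it.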
I really meant just the low side FETs, which are the ones that stand between the switch node and ground in the buck converter (you should really read about how a buck converter works before you do anything else).
Each "phase" in the VRM has two transistors - a high side, which switches the 12V...
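To put a number on how the two FETs share the work: in an ideal buck converter the output voltage is the input voltage times the high side duty cycle. A quick sketch, assuming a lossless converter in continuous conduction and ignoring dead time; the 1.5V output is a hypothetical rail voltage, not a measurement from this card.

```python
def buck_duty_cycle(v_in, v_out):
    """Ideal buck converter duty cycle: D = V_out / V_in.

    Assumes no losses and continuous inductor current.
    """
    if v_in <= 0 or not (0 <= v_out <= v_in):
        raise ValueError("need v_in > 0 and 0 <= v_out <= v_in")
    return v_out / v_in

# 12V input as in the post; the 1.5V output is an assumed example value.
duty = buck_duty_cycle(12.0, 1.5)
print(f"high side on for {duty:.1%} of each cycle")
print(f"low side on for {1 - duty:.1%} of each cycle")
```

With a low output voltage the low side FET conducts for most of the cycle, which is part of why it's the first thing to pull when you're chasing a short to ground.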
In this situation, that's likely to make things worse, and not better. Hardgfg's card has experienced a failure in the memory VRM that's created a fairly high resistance short to ground. This isn't an issue of the BGA solder joints failing - something else on the board has most likely melted...
Generally yes, but it doesn't always work, especially when the short is still fairly high in resistance, as yours is, and thus doesn't draw much current. The best tools to locate the warm spots on the board are, in this order:
1. a thermal camera
2. freeze spray
3. isopropanol
4. your lips
5...
How are you supplying the voltage? A bench power supply should tell you how much current it's applying.
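Once you have the voltage and current off the bench supply, Ohm's law gives you the resistance of the short and how much heat it's making. A quick sketch; the readings below are made up for illustration, so plug in your own.

```python
def short_resistance(v_applied, i_measured):
    """Resistance of the short in ohms, R = V / I."""
    if i_measured <= 0:
        raise ValueError("current must be positive")
    return v_applied / i_measured

def short_power(v_applied, i_measured):
    """Power dissipated in the short in watts, P = V * I."""
    return v_applied * i_measured

# Hypothetical bench supply readings: 1.0 V applied, 0.05 A drawn.
volts, amps = 1.0, 0.05
print(f"short resistance: {short_resistance(volts, amps):.1f} ohms")
print(f"heat in the short: {short_power(volts, amps) * 1000:.0f} mW")
```

A few tens of milliwatts spread over a component won't warm it enough to feel, which is why a high resistance short is hard to find by touch.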
Did you try the isopropanol technique I mentioned? Watching the isopropanol evaporating is probably the best way, other than a thermal camera or freeze spray. Using your fingers won't...
Try putting some isopropanol on some of the suspect components, such as the FETs and the GPU die, and then turn the power supply on and leave it on for a couple of minutes, while you watch to see where the alcohol evaporates the fastest.
If you still don't see anything, start checking other...