Date: 2020-05-27
Strange, intermittent VGA failures - 1080ti

I'm just checking in to see if anyone else has experienced anything like this.

I have a Thermaltake Core P5 with a vertical GPU mount, seen below:

IMG_20190830_124539024.jpg

I recently replaced a few of the hard-tubed lines with ZMT, as I needed the 14mm fittings for the build's next iteration. One of the runs I replaced was the connection between the CPU and GPU, and that 3/8" x 5/8" tubing has some spring to it in a bend that tight. It's kinda pressing against the GPU block terminal, forcing it toward the viewer.

Now the machine boots only one try in ten. The other nine, it fails on VGA, as indicated by the status LEDs and beep codes.

It seems like if I grab the GPU waterblock and pinch it by hand while booting, the machine will start. That's really bizarre though, so before I drain it and rework it, I wanted to ask: has anyone else seen any issue like this before?
 
The strain may have caused the video card to make imperfect contact on the PCI-E pins in the extender. I would definitely rework the loop to allow a better connection to the PCI-E extender. It's never good to have any kind of strain on component connections.
 
Zedicus

ZMT can be softened with a (good) hair dryer on the high heat setting. It will form enough to take some of the spring out of it. You should be able to do this without draining the loop. Put some extra pressure on the card while doing this to get a bit more form on the tubing. (A cautious person with a heat gun would be fine as well.)
 
I reconfigured the machine to a more traditional state with the GPU mounted directly in the PCI-E slot. Still no video, and the beep codes still indicate VGA failure. So I grabbed the passively cooled GT710 I keep on hand for situations like this and plugged it into my second PCI-E slot. I switched monitor inputs (leaving the DP cable plugged into the 1080ti) and hit the button.

The machine boots and I load the BIOS - I'm still getting error LEDs and beep codes indicating VGA failure, but in the BIOS, my 1080ti is properly identified and I'm getting temperature readings from it.

Unrelated, but I'm recovering the system from a malware infection right now (be CAREFUL when you snag VLC media player) and so Windows boots into safe mode. In safe mode, both display adapter drivers are running and reporting no problems.

I try booting in regular mode and the OS refuses to load fully - I get a black screen with a mouse cursor, but no Windows interface and no HDD activity. So I grab my USB stick and reinstall Windows.

I install while unplugged from ethernet, so both display adapters are running on generic Microsoft display drivers - and one of them is reporting a problem. I don't know which one it is, but when I switch inputs on my monitor to the 1080ti, I get no signal. I switch back to the 710 and plug in the ethernet cable to allow Windows to snag drivers for the cards.

It does so (and goes through its cycle of reboots for applying updates), and I pull up Device Manager again - both cards are reported and identified, and neither is reporting problems. I switch to the 1080ti input on my monitor and it's WORKING. I'm typing this on it right now.

I've got my fingers crossed while I download a video benchmark to test stability. I do NOT want to have to replace my GPU right now. Have you ever heard of malware screwing a video driver so hard that it has problems like this?
 
Is there any chance that coolant from your water cooling setup leaked on it? I've never heard of malware that could cause a failure in this manner - leakage from the cooling system seems vastly more likely, particularly if you're using hard tubes.

The behavior you're seeing, where it works with the compatibility driver but not the real one, is a pretty common indicator of a failure of the logic portion of the graphics card. It's not uncommon for some cards to start working again once they heat up after a few minutes with power applied.

Have you checked for corrosion on the back of the card? If that's the problem, you may be able to just clean it off and get it to work again.
 
