GPU upgrade for my Dell Precision T3610 causes reboots

I have a Dell Precision T3610. I would like to clarify that this PC uses a proprietary PSU/Motherboard/Case so I can't really replace any of those with standard parts.

My PC has the 685 Watt PSU, and it originally came with an Intel Xeon 1260v2, an Nvidia Quadro K4000, and a 15,000RPM SAS drive. I have since replaced the CPU with a Xeon 2667v2, upgraded the RAM to 8x16GB, and replaced the SAS HDD with two SSDs and a 7200RPM HDD.

The system had been working perfectly fine until I tried to upgrade the GPU. I got an HP OEM-style RTX 2060 Super, hoping that a lower-profile, lower-power card would be easier to fit in my case and wouldn't demand too much power. This kind of card:



I thoroughly tested the card in another machine for about two weeks and had no issues.

However, one problem is that the PSU only has a single proprietary 8-pin connector for GPU power, which the stock cable splits into two 6-pins, and the GPU I got takes a single 8-pin. I wanted to just get an 8-to-8-pin cable, but I could not find one; all of them were 8-pin to 2x 8-pin, which I do not trust. So I got a cable that converts my two 6-pins back into a single 8-pin, this one:

https://www.amazon.com/dp/B07V4GGS43
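
For reference, my back-of-the-napkin math on the connectors, using the generic PCIe spec limits (these are assumptions; I don't know what Dell's proprietary harness is actually rated for):

Code:
# Quick sanity check using generic PCIe spec limits (assumed values,
# not anything Dell publishes for this proprietary harness).
SLOT_W      = 75    # PCIe x16 slot
SIX_PIN_W   = 75    # each 6-pin aux connector
EIGHT_PIN_W = 150   # 8-pin aux connector
GPU_BOARD_W = 175   # RTX 2060 Super reference board power

adapter_w = 2 * SIX_PIN_W        # what the 2x6 -> 1x8 adapter nominally carries
available = SLOT_W + adapter_w   # slot + aux cable
print(f"available: {available} W, GPU needs: {GPU_BOARD_W} W, "
      f"headroom: {available - GPU_BOARD_W} W")

On paper that is 225 W available for a 175 W card, so if those spec numbers hold for this harness, the adapter itself shouldn't be the bottleneck.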

Here it is installed:


One oddity I noticed: on the stock OEM cable, only 6 of the 8 pins appear to be populated on the end that connects to the PSU:

The whole setup looks like this:


So after I installed the card I ran more tests to make sure my system could handle it. I tried FurMark's stress test, and it ran fine for about 5 minutes; according to my UPS, the system was pulling around 300-360 watts.

Then I closed that and tried Prime95; again the system was pulling in the mid-300s according to my UPS.

Then I ran both at once... to my surprise, it seemed to be pulling about the same amount of power, with a few spikes to 400 watts, but that's it.

I walked away for a minute, and when I came back the system had rebooted.
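
For the next round of testing I plan to log the GPU side directly instead of just eyeballing the UPS readout. Something like this throwaway Python loop is what I have in mind (it assumes nvidia-smi is on the PATH; the query fields are standard nvidia-smi ones, though not every card reports power draw):

Code:
import subprocess
import time

# Poll nvidia-smi once a second and append readings to a CSV, so there is
# a record of power/temps/clocks leading up to a crash.
with open("gpu_power_log.csv", "a") as log:
    while True:
        result = subprocess.run(
            ["nvidia-smi",
             "--query-gpu=timestamp,power.draw,temperature.gpu,clocks.sm",
             "--format=csv,noheader"],
            capture_output=True, text=True)
        log.write(result.stdout)
        log.flush()
        time.sleep(1)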

No doubt some kind of overcurrent protection kicked in while I was gone, so I am looking for advice on what to do. I would have assumed that 685 watts would be enough for all this, and I have no idea if the PSU is at fault. If it is, there are 800 and 1300 watt PSUs, but I am not sure if they would work in my system. I see a few sites listing them for the Precision T3600 series and up... but most list them for the T5000 series and up. I have no idea if the 800+ watt PSUs are compatible with my system, and I don't want to risk putting in an incompatible PSU... it's not like I can use a standard PSU.
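
For what it's worth, my rough budget says 685 watts should be plenty. These are ballpark TDP figures and guesses, not measurements:

Code:
# All numbers are ballpark TDPs / estimates, not measurements.
parts = {
    "Xeon 2667v2 (130 W TDP)":       130,
    "RTX 2060 Super (175 W TDP)":    175,
    "8x 16 GB DIMMs (~4 W each)":     32,
    "2x SSD + 1x 7200RPM HDD":        15,
    "Motherboard, fans, USB, misc":   50,
}
total = sum(parts.values())
print(f"Estimated worst-case DC draw: {total} W of a 685 W PSU")
# ~400 W, which lines up with the ~400 W spikes the UPS showed at the wall
# (and wall watts include PSU losses, so the DC side would be even lower).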

Or I wondered if it could maybe be the 2x6 to 1x8 pin adapter. Like I mentioned I found it weird that only 6 pins on the OEM cable of all things are populated, and that my system was not pulling more than 400 watts with the CPU and GPU both being stressed while it was pulling in the mid-high 300s with each separate part stressed, so I don't know if it might be either the OEM cable or the 6 to 8 pin adapter I got that is not supplying enough power.

And if that is the case, whether that 1x 8-pin to 2x 8-pin adapter might be a better idea, this one here:
https://www.amazon.com/dp/B07P82ZH22

I do not trust that kind of cable because splitting a single 8-pin into two seems pretty risky and dangerous, but I would only be using it to power a single 8-pin connector, and it APPEARS to populate all the pins on the PSU side (though it seems like it just re-routes the ground pins on the GPU end to other pins?).

Or maybe this is something else entirely, and I am just pushing my system too hard with all the modifications I have made: upgrading the CPU, maxing out the RAM, putting in three drives in place of the one, which required a SATA power splitter (though that is replacing one 15,000RPM SAS drive with a 7200RPM drive and two SSDs, on a system designed to handle up to two of those SAS drives, so I would assume the SATA power is not being overloaded), and now upgrading the GPU.

Any advice on what could be the issue and how to try to solve it?
 
Well great, it JUST happened again, and this time the RTX 2060 isn't even in my system! I swapped it out temporarily for a GT 720 about a week ago until I could figure this out. I thought for sure it was the RTX 2060, since from the day I installed it my system would crash within roughly 24 hours; at most I got it running for 48, and many times it wasn't even up for 24. Ever since I swapped to the GT 720 it had been running fine, so I was trying to narrow down whether it was the card, cooling, or power.

But just now the same thing happened with the GT 720 after a week of being fine. I was going between Chrome windows when suddenly the window I switched to was completely black, then my entire display became garbled and the video signal was lost. Trying to reset the video driver with Win+Ctrl+Shift+B did nothing, and mashing Ctrl+Alt+Del did nothing; the only difference was that the second I pressed the power button it hard shut down instead of making me hold it for 10 seconds.

I heard someone mention that I might have to disable SERR messages or VT-x in my BIOS? Has anyone ever heard of having to do that to fix something like this? I use VirtualBox on this system as well; would disabling VT-x affect that?

The event log didn't show much, just a mention of "The computer has rebooted from a bugcheck" and a minidump being generated.
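
If anyone wants to see the raw entries, something like this pulls the last few unexpected-shutdown and bugcheck events out of the System log (a quick Python wrapper around wevtutil; I'm assuming the Kernel-Power and BugCheck source names, which is what my Event Viewer shows):

Code:
import subprocess

# Dump the last 5 events from each source, newest first, as plain text.
for source in ("Microsoft-Windows-Kernel-Power", "BugCheck"):
    query = f"*[System[Provider[@Name='{source}']]]"
    subprocess.run(["wevtutil", "qe", "System",
                    f"/q:{query}", "/f:text", "/c:5", "/rd:true"])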

The minidump seems to imply that it's somehow STILL the Nvidia driver... despite my having completely wiped the drivers and reinstalled a much older driver, the last one that supported the GT 720:

Microsoft (R) Windows Debugger Version 10.0.22621.755 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.


Loading Dump File [C:\Windows\Minidump\031023-11156-01.dmp]
Mini Kernel Dump File: Only registers and stack trace are available

Symbol search path is: srv*
Executable search path is:
Windows 10 Kernel Version 19041 MP (16 procs) Free x64
Product: WinNt, suite: TerminalServer SingleUserTS
Machine Name:
Kernel base = 0xfffff805`50200000 PsLoadedModuleList = 0xfffff805`50e2a210
Debug session time: Fri Mar 10 13:40:58.599 2023 (UTC - 6:00)
System Uptime: 4 days 20:46:45.393
Loading Kernel Symbols
...............................................................
................................................................
................................................................
....
Loading User Symbols
Loading unloaded module list
.................................
For analysis of this file, run !analyze -v
15: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                             Bugcheck Analysis                               *
*                                                                             *
*******************************************************************************

VIDEO_TDR_FAILURE (116)
Attempt to reset the display driver and recover from timeout failed.
Arguments:
Arg1: ffffc206961ea010, Optional pointer to internal TDR recovery context (TDR_RECOVERY_CONTEXT).
Arg2: fffff80571da5838, The pointer into responsible device driver module (e.g. owner tag).
Arg3: 0000000000000000, Optional error code (NTSTATUS) of the last failed operation.
Arg4: 000000000000000d, Optional internal context dependent data.

Debugging Details:
------------------

Unable to load image nvlddmkm.sys, Win32 error 0n2
*** WARNING: Unable to verify timestamp for nvlddmkm.sys

KEY_VALUES_STRING: 1

Key : Analysis.CPU.mSec
Value: 2890

Key : Analysis.DebugAnalysisManager
Value: Create

Key : Analysis.Elapsed.mSec
Value: 5844

Key : Analysis.Init.CPU.mSec
Value: 3858

Key : Analysis.Init.Elapsed.mSec
Value: 75133

Key : Analysis.Memory.CommitPeak.Mb
Value: 103


FILE_IN_CAB: 031023-11156-01.dmp

DUMP_FILE_ATTRIBUTES: 0x8
Kernel Generated Triage Dump

BUGCHECK_CODE: 116

BUGCHECK_P1: ffffc206961ea010

BUGCHECK_P2: fffff80571da5838

BUGCHECK_P3: 0

BUGCHECK_P4: d

VIDEO_TDR_CONTEXT: dt dxgkrnl!_TDR_RECOVERY_CONTEXT ffffc206961ea010
Symbol dxgkrnl!_TDR_RECOVERY_CONTEXT not found.

PROCESS_OBJECT: 000000000000000d

BLACKBOXBSD: 1 (!blackboxbsd)


BLACKBOXNTFS: 1 (!blackboxntfs)


BLACKBOXPNP: 1 (!blackboxpnp)


BLACKBOXWINLOGON: 1

CUSTOMER_CRASH_COUNT: 1

PROCESS_NAME: System

STACK_TEXT:
ffffae89`095aa808 fffff805`6769555e : 00000000`00000116 ffffc206`961ea010 fffff805`71da5838 00000000`00000000 : nt!KeBugCheckEx
ffffae89`095aa810 fffff805`67694bc1 : fffff805`71da5838 ffffc206`961ea010 ffffae89`095aa919 00000000`00000000 : dxgkrnl!TdrBugcheckOnTimeout+0xfe
ffffae89`095aa850 fffff805`67abd483 : ffffc206`961ea010 00000000`00989680 ffffae89`095aaa30 00000000`019a8d58 : dxgkrnl!TdrIsRecoveryRequired+0x1b1
ffffae89`095aa880 fffff805`67b1b2fb : ffffc206`8b491000 00000000`00000001 ffffc206`8b491000 00000000`00000000 : dxgmms2!VidSchiReportHwHang+0x62f
ffffae89`095aa980 fffff805`67ae8142 : ffffae89`095aaa01 00000000`019a8cd7 00000000`00989680 00000000`00000040 : dxgmms2!VidSchiCheckHwProgress+0x3318b
ffffae89`095aa9f0 fffff805`67a8a11a : 00000000`00000000 ffffc206`8b491000 ffffae89`095aab19 ffffc206`8b491000 : dxgmms2!VidSchiWaitForSchedulerEvents+0x372
ffffae89`095aaac0 fffff805`67b0d405 : ffffc206`8f3f8000 ffffc206`8b491000 ffffc206`8f3f8010 ffffc206`8b541620 : dxgmms2!VidSchiScheduleCommandToRun+0x2ca
ffffae89`095aab80 fffff805`67b0d3ba : ffffc206`8b491400 fffff805`67b0d2f0 ffffc206`8b491000 fffff805`4d64b100 : dxgmms2!VidSchiRun_PriorityTable+0x35
ffffae89`095aabd0 fffff805`50455485 : ffffc206`89929080 fffff805`00000001 ffffc206`8b491000 00078425`bd9bbfff : dxgmms2!VidSchiWorkerThread+0xca
ffffae89`095aac10 fffff805`50602cc8 : fffff805`4d64b180 ffffc206`89929080 fffff805`50455430 00000000`00000000 : nt!PspSystemThreadStartup+0x55
ffffae89`095aac60 00000000`00000000 : ffffae89`095ab000 ffffae89`095a5000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x28


SYMBOL_NAME: nvlddmkm+dd5838

MODULE_NAME: nvlddmkm

IMAGE_NAME: nvlddmkm.sys

STACK_COMMAND: .cxr; .ecxr ; kb

FAILURE_BUCKET_ID: 0x116_IMAGE_nvlddmkm.sys

OSPLATFORM_TYPE: x64

OSNAME: Windows 10

FAILURE_ID_HASH: {c89bfe8c-ed39-f658-ef27-f2898997fdbd}

Followup: MachineOwner
---------
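
Since the bucket is VIDEO_TDR_FAILURE, one thing I might try while I keep hunting for the real cause is lengthening the TDR timeout so a brief GPU hang gets a chance to recover instead of bugchecking. TdrDelay under GraphicsDrivers is the documented knob for this (seconds, default 2); to be clear, this only masks the hang, it doesn't fix whatever is causing it. A minimal sketch of setting it from Python:

Code:
import winreg

# Raise the GPU timeout-detection delay from the default 2 s to 10 s.
# Requires an elevated prompt and a reboot to take effect.
key_path = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, key_path, 0,
                        winreg.KEY_SET_VALUE) as key:
    winreg.SetValueEx(key, "TdrDelay", 0, winreg.REG_DWORD, 10)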
 
Clean Windows install on the new GPU? Sometimes old drivers play havoc with newer GPUs, and not all "clean" installs get it done. Just a thought.
 
The old GPU was a Quadro; those drivers would not even work on the GeForce class of GPUs, but I completely cleaned them out anyway. I've never had issues with wiping drivers for GPU upgrades before.

But yes, I used DDU in safe mode, and when I reinstalled the drivers I did it with the Ethernet cable disconnected and chose the clean install option. I even wiped my shader cache just to be safe.

I am starting to suspect it's some other hardware fault that is causing the GPU to crash.
 