6MHz Overclock Makes System Unresponsive

Epos7

I've had an RTX 2080 for a couple of years. Previously I was able to apply a modest overclock without issue, but I had been running the card at stock for some time.

Now I'm finding that with either MSI Afterburner or EVGA Precision X1, I can launch the program without issue. As soon as I apply an OC, even something as small as a 6MHz core increase, the software and system become almost unresponsive, requiring a restart. I used DDU yesterday to try to make sure there aren't any driver issues.

Not quite sure what to make of it. I'm running the latest NVIDIA drivers. One thing I noticed is that the power limit adjustment is greyed out in both pieces of software. I thought that had been an option before.

Recently I've been playing Control, and I'm seeing much lower frame rates than I would expect given my system and what I'm seeing from benchmarking sites. That's what prompted me to try and overclock the card a little bit.

Ryzen 3900X
2 x 16GB DDR4-3000 @ DDR4-3400
Corsair SF600 PSU
 
Even clicking the reset to defaults button in Afterburner kills the system. Everything becomes painfully laggy until I restart.

The graphics card gets really warm after applying any settings in Afterburner, while the system is stuttering. It's odd because it doesn't get over 62C in games. It's as if Afterburner and Precision X1 are applying the settings incorrectly and causing it to crash.
 
DDU again and install the last driver version, not the new one. There are several people here having various issues with it.
 
Thanks. Tried rolling back from 445.87 to 445.75, but no difference. Also odd was that after installing the new drivers, the system stuttered badly until I rebooted.

I installed an Accelero III on the card about six months ago. I'm wondering if I damaged it, or using non-stock fans on the GPU fan header is causing issues. Haven't really noticed any problems until recently, though.
 
Did you remove AB/X1 and then DDU? Didn't realize that .87 came out; the issues were with .75. Maybe try one more version back, then. What are your temps/volts like? Use GPU-Z and see if it works.
 
Didn't remove those tools before DDU, will keep that in mind.

I updated my X570 I Aorus mobo BIOS from F10 to F12e. Tentatively, that seems to have cleared up the issue. The BIOS update reset my RAM to 2166, so it seems likely the memory speed/timings were the issue. I'll push them back up and retest.
 
Working at the XMP DDR4-3000 profile, so my memory overclock looks like the culprit.

I still don't have the option to adjust the power limit slider, and I'm not quite sure how I can get that back.
 
I played a few minutes of Control with GPU-Z open, and it looks like my GPU max was 1510MHz and memory max was 1750MHz. GPU voltage peaked at 0.775V, which seems low. It looks like the card has been stuck with a performance cap due to 'idle', though I'm not sure if that is being reported correctly.
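For a second reading alongside GPU-Z, the driver's own throttle flags can be decoded. This is only a sketch: it assumes `nvidia-smi` is on the PATH and that this driver version accepts the `clocks_throttle_reasons.active` query field. The bit meanings follow NVML's `nvmlClocksThrottleReason*` constants, and the helper names are made up for illustration.

```python
import subprocess

# Bit values follow NVML's nvmlClocksThrottleReason* constants.
THROTTLE_REASONS = {
    0x001: "GPU idle",
    0x002: "Applications clocks setting",
    0x004: "Software power cap",
    0x008: "Hardware slowdown",
    0x010: "Sync boost",
    0x020: "Software thermal slowdown",
    0x040: "Hardware thermal slowdown",
    0x080: "Hardware power brake slowdown",
    0x100: "Display clock setting",
}


def decode_throttle_reasons(mask_text):
    """Turn a hex bitmask string such as '0x0000000000000001' into reason names."""
    mask = int(mask_text.strip(), 16)
    return [name for bit, name in THROTTLE_REASONS.items() if mask & bit]


def read_active_mask():
    """Ask nvidia-smi for the active throttle bitmask (assumes the tool and field exist)."""
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=clocks_throttle_reasons.active",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    # With one GPU installed, the first line holds the mask.
    return result.stdout.splitlines()[0]


if __name__ == "__main__":
    print(decode_throttle_reasons(read_active_mask()))
```

If the mask decodes to only "GPU idle" while a game is running, that would be consistent with the 'idle' cap GPU-Z is showing.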
 
Ran Superposition at 1080p Extreme and got a score of 5742 where a typical score for a 2080 is ~7200. Something is up.
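To put a number on that gap, here is a quick calculation using only the two scores above; the function name is just illustrative.

```python
def shortfall_percent(measured, typical):
    """Return how far a measured score falls below a typical one, in percent."""
    return (typical - measured) / typical * 100.0


if __name__ == "__main__":
    # Scores from the Superposition 1080p Extreme run described above.
    print(f"{shortfall_percent(5742, 7200):.0f}% below a typical 2080 score")
```

That works out to roughly a fifth of the expected performance missing, which is well beyond normal run-to-run variation.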
 
Sounds like it could be a power draw issue. I'd physically take the GPU out, reconnect it and reconnect the gpu power from the psu. While it's out, clean out the fan to make sure it's not just dust buildup.
If you have a modular psu, try using a different power cable.

How are the temps?

Weird things happen sometimes.

On my end, I'm getting some texture flickering in Monster Hunter World since the new driver that was never there before. So it could be a driver issue.
 
I plugged the PCIE power cables into a PSU tester (Dr. Power II) and they check out OK.

Temps are pretty good - typically around 58-60C gaming.
 
I'd hate to say it, but didn't some of the original 2080s in 2018 die in that fashion? Basically, boost clock was broken and instant crashes?

My video card would seemingly lock up in some games with a modest OC, or very little OC at all, after a while of working fine at +120 on the core. It turned out in the end that it was my RAM acting up and requiring higher CPU IO voltage (I had to run it at 1.12V instead of 1.05V for my speed). You may want to check that out! RAM can cause some strange problems! Although here, I suspect it's probably something else if OCX1 is acting weird. Can't say I ever had that problem when it was my RAM issue...
 
Thanks! I am suspicious of the RAM as it was the cause of the initial problem I was seeing (instability after applying an OC). I've been tempted to get a nicer kit, so this may be a good excuse to do so.

This also seems like a very similar issue:

https://www.nvidia.com/en-us/geforc...ot-boosting-power-limit-locked-driver-issue-/

Sounds like it's possibly a Windows issue.
 
Saw some posts from EVGA saying Gamers Nexus was able to resolve their issue with a firmware reflash. Tried reflashing the 2080 firmware but no dice. I also tried flashing it with the XC Ultra firmware just to see if that might reset something, but no change. Still haven't been able to dig up the Gamers Nexus videos/articles on their investigation into the issue.

Ordered a new memory kit. I'm never going to be able to stop being suspicious of this RAM after it caused some of the initial issues.

Also emailed EVGA to see what they say. I likely voided my warranty installing the Accelero III (something I was fully aware of at the time) but maybe they have some ideas.
 
Early production runs of the 2080 from EVGA had an issue where power consumption would be locked to idle no matter what you tried. I don't have the links on me now, but I'm pretty sure someone posted a thread about it in the NVIDIA subforum.
That's odd. EVGA will honor your warranty if you reinstall the original cooler before sending it in, but they can still deny it if they determine that there is any physical damage that was caused by you.
 
Oh interesting, wasn't aware of that. I kept the stock cooler so I could put things back together if needed.
 
I had similar problems on my 2080 Ti and it ended up being RAM speed in the end.

Originally I was running RAM at the rated 4133, but that caused games to almost instantly crash (in game less than 1 minute). I put RAM at 2133 and everything was stable.

Finally, I put the XMP profile back (4133) but this time added some voltage to the RAM. I don't recall the exact numbers but it was a small bump and that fixed it.
 
Good to know, thanks!

I ordered a new XMP 3600 kit to try out. Had been thinking about upgrading anyway since I have been using a cheap 3000 kit. RAM should be here tomorrow so I can see if that resolves things.
 
To rule out memory can you try setting your memory to JEDEC standard 2133MHz with the default voltage/timings?
 
Thanks for the suggestion. Gave that a shot but no difference.

Hmmm, weird. Not sure ordering different memory is going to help, then.

I'd maybe flash your motherboard BIOS to the latest version (AGESA 1.0.0.4B has some improvements for Zen 2 anyway) and also in the BIOS force the PCIe lane to use PCIe 3.0 speeds to see if that makes a difference.
 
I have flashed my motherboard BIOS to the latest version. I've also tried flashing the video card BIOS/firmware with nvflash and with EVGA's utility.

I think I found an RTX 2070 I can borrow for a day. I plan to install that in my machine and see if it boosts properly.

Probably right about the RAM, although I was considering an upgrade to faster memory anyway so it seemed like a good time to make that purchase.
 
Just keep in mind that the new RAM adds another variable to the equation, which may complicate the instability troubleshooting.
 
Out of curiosity, if you go into the event log, do you see anything for nvlddmkm? Event ID 14.
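One way to pull those entries without clicking through Event Viewer is Windows' built-in `wevtutil` tool. The sketch below assumes the driver logs to the System log under the provider name `nvlddmkm`; the function names and the default count of 10 are arbitrary choices for illustration.

```python
import subprocess


def build_event_query(provider="nvlddmkm", event_id=14, count=10):
    """Build a wevtutil command listing the newest matching System log events."""
    xpath = f"*[System[Provider[@Name='{provider}'] and (EventID={event_id})]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",   # XPath filter on provider name and event ID
        f"/c:{count}",   # limit the number of events returned
        "/rd:true",      # newest events first
        "/f:text",       # human-readable output
    ]


def recent_driver_events(count=10):
    """Run the query (Windows only) and return the raw text output."""
    result = subprocess.run(
        build_event_query(count=count),
        capture_output=True, text=True, check=False,
    )
    return result.stdout


if __name__ == "__main__":
    print(recent_driver_events())
```

Empty output would simply mean no matching events were found in the System log.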
 
EVGA asked me to remove the Accelero III and reinstall the stock cooler before sending it back to them. I put the card in the freezer overnight then used some pliers to gently twist off the heatsinks that had been applied with the thermal glue included with the Accelero III.

I thought it worked really well, then after I finished cleaning it up I noticed one of the VRMs has a crack :(

I knew the risks going in when installing the Accelero III but still a bummer. I'm pretty proficient with a soldering iron, not sure if replacing the VRM is a possibility. I'd like to try and salvage the card if possible to send back to EVGA. Not sure what their repair costs will look like but worth a shot.
 