Dead 4090 FE?

LittleBuddy

A few days ago I was gaming and my PC shut off in an instant. I spent about 6 hours troubleshooting and went through my usual steps:
  • Tested 1 stick of RAM at a time (no POST)
  • Reset the BIOS to defaults via the jumper (also performed a BIOS flashback, no POST)
  • Looked at each of the power cables (they were all fine, including the 12VHPWR)
  • Inspected the PCI-E port, PCI-E riser cable, and GPU PCI-E connector (all were fine)
  • Inspected the motherboard. As far as I can tell everything looks fine. I did not check the CPU and socket because I don't have thermal paste on hand; I ordered some and it arrives today.
  • Tested DP & HDMI on the GPU.
  • Removed the GPU
    • Removing the GPU allowed the PC to boot via the APU
      • Since the APU worked I put in my 3090 and it booted just fine.
        • I used DDU, then reinstalled the latest drivers for the 3090
        • I am noticing massive performance issues in some games; framerates are highly variable. I was playing a Unity game (Subnautica) and the framerate ranged from ~150 FPS down to drops into the 30s. I paid attention to the GPU usage and it stays at 100% during these dips.
          • The performance issues make me think something else might be wrong with the system that I am overlooking, since I haven't had these issues in the past.
I did order a new PSU that is arriving today (I was going to buy a new PSU anyway before the end of the year; I ordered the SF850).

Is there any further testing I should do to confirm it is the 4090? I have started the RMA process with Nvidia, but since I have the PSU arriving today I feel I might just give it one last try before shipping it out.
 
I think you've done the proper tests. Works with integrated and discrete GPUs so you know it's not the CPU or mobo. Not much else to do at this point.
 
You could break out the multimeter, but that would probably only tell you which component on the card has failed. At the very least, though, you could monitor the 12V wires at the GPU.

What are the logs/charts for the GPU's voltage and clock showing?
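
For anyone who wants to capture that kind of log without a GUI tool, here is a rough Python sketch that parses readings from `nvidia-smi`. It assumes the NVIDIA driver's `nvidia-smi` is on the PATH, and the helper names (`parse_samples`, `read_gpu_once`) are made up for this example. As far as I know `nvidia-smi` does not report core voltage, so that part still needs something like HWiNFO or GPU-Z.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi. Core voltage is not among the fields it
# reports, so this only covers clock, power, load, and temperature.
FIELDS = ["timestamp", "clocks.gr", "power.draw", "utilization.gpu", "temperature.gpu"]


def parse_samples(csv_text):
    """Turn nvidia-smi CSV output (noheader, nounits) into a list of dicts."""
    samples = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != len(FIELDS):
            continue  # skip blank or malformed lines
        values = [cell.strip() for cell in row]
        sample = {"timestamp": values[0]}
        for name, value in zip(FIELDS[1:], values[1:]):
            try:
                sample[name] = float(value)
            except ValueError:
                sample[name] = None  # e.g. a field the card does not report
        samples.append(sample)
    return samples


def read_gpu_once():
    """Take one reading (one row per installed GPU). Needs the NVIDIA driver."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_samples(out)
```

Calling `read_gpu_once()` in a loop with a short sleep while the game runs gives a simple log; lining up clock or power dips against the moments the framerate tanks is the interesting part.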
 
Voltage and clocks are normal on the 3090 (1800-2000 MHz boost, 900 mV to 1.1 V); it's just the frame time graph, 1% low, and 0.1% low that are awful. Maybe I am just not used to the 3090, and it was that much worse than the 4090.

I'll just ship it out today; it seems like I've done all I can. It would be interesting to tear down the GPU and see if anything is apparent, but I'd rather not void the warranty right now.
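
For reference, the 1% and 0.1% lows mentioned above can be worked out from a frame time capture. This is only a sketch: tools differ on the exact definition, and this one assumes "X% low" means the average FPS over the slowest X% of frames. The function names are made up for the example.

```python
def average_fps(frame_times_ms):
    """Overall average FPS from a list of frame times in milliseconds."""
    return 1000.0 * len(frame_times_ms) / sum(frame_times_ms)


def percent_low_fps(frame_times_ms, percent):
    """Average FPS over the slowest `percent` % of frames.

    Assumed definition; some overlays report a percentile frame time instead.
    """
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    slowest_first = sorted(frame_times_ms, reverse=True)
    # Always keep at least one frame so short captures still give a number.
    count = max(1, int(len(slowest_first) * percent / 100))
    worst = slowest_first[:count]
    return 1000.0 / (sum(worst) / len(worst))


# Made-up capture: 99 smooth frames at 10 ms and one 40 ms hitch.
frames = [10.0] * 99 + [40.0]
print(round(average_fps(frames), 1))   # average barely moves
print(percent_low_fps(frames, 1))      # the hitch dominates the 1% low
```

A capture like this shows why the average can look fine while the lows look terrible: a handful of long frames is enough.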
 
If the system shut down completely when it went bad, and removing the graphics card causes it to behave normally, then you can be about 90% sure that the problem is the graphics card.

Given the symptoms, I would suspect a failure of a high-side power FET in one of the voltage regulators. You can probably confirm this by checking with a multimeter in resistance mode from each 12V input. What you'll probably find is that you have 750+ ohms on most of them, and something like 0.1-0.5 ohms on one of them. If this is what you find, then you know that one of the VRMs fed by that 12V input has failed, and shorted the 12V to its output, and what happens when you push the power button is that the power supply sees a bazillion amps of output current, triggering the overcurrent protection shutdown.

I would assume a 4090 is still under warranty, so I'd just RMA it. If it were somehow out of warranty, the next step would be to pull the heatsink off and start looking for clues as to which power phase is damaged.
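
To put rough numbers on the resistance check described above, here is a small Ohm's law sketch in Python. It ignores cable and PSU resistance, the 1 ohm cutoff is just an assumption picked to sit between the two ranges quoted, and the input names and readings are hypothetical, not measurements.

```python
RAIL_VOLTAGE = 12.0  # volts on the 12V inputs


def short_circuit_current(resistance_ohms, voltage=RAIL_VOLTAGE):
    """Current in amps that the rail would try to push through this resistance."""
    return voltage / resistance_ohms


def looks_shorted(resistance_ohms, threshold_ohms=1.0):
    """Flag a reading as a likely short (the threshold is an assumption)."""
    return resistance_ohms < threshold_ohms


# Hypothetical multimeter readings in ohms, one per 12V input.
readings = {"input A": 750.0, "input B": 0.3}

for name, ohms in readings.items():
    amps = short_circuit_current(ohms)
    status = "likely shorted" if looks_shorted(ohms) else "looks normal"
    print(f"{name}: {ohms} ohm -> {amps:.2f} A at 12 V ({status})")
```

At 0.1-0.5 ohms the rail would be asked for roughly 24-120 A, versus about 0.016 A at 750 ohms, which is why the supply would trip its overcurrent protection right away.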
 
I see why you would think that, but unfortunately that's not the case for my 4090. The system would turn on and the mobo LED would go from orange to white as if it was going to boot, and the fans in the system and on the GPU would spin up, but the screen would just stay black. Initially I thought it was the HDMI port, but I tested DP and it didn't work either. I'm not an electrical engineer or anything, far from it, so I'm likely wrong, but I suspect the GPU itself just went bad. I wasn't pushing the GPU hard in the last week; I've been playing a few older games that typically pull around 80-180W on the GPU. If a cap was blown or a VRM failed, I didn't hear or smell anything out of the ordinary. I don't know. Last week I was having a strange issue with Tekken 8 where my screen would go black for a few seconds during a loading screen. I thought it was just the game acting up, but it could have been the first sign of this failure now that I think about it.
 
Does sound like a dead GPU.

I’ve seen so many repair videos where these boards develop cracks in the area between the back latch of the PCIe slot and the bottom right corner of the core (the corner closest to the latch). It seems to be a trouble spot that can cause the core to need a reball. Not saying yours is, I’ve just seen some. I believe Northwest Repair has an in-depth video on it.

Also the SF750 is actually quieter and preferred to the SF850 in most instances. It’s rated better by customers and is ultra reliable even for a 4090.

I’d take an SF750 before I went with the 850. Sounds weird, but look into it.
 
You don't always get a violent pop or smell or anything when a FET goes bad. Sometimes the FET just melts inside, creates the short, and the system shuts down. When it initially failed, did the system actually shut down completely, or did you just get a black screen?
 
Seems like it is dead, bad luck I guess. Let us know how the RMA process with Nvidia goes! I had to RMA my 2080 Ti back when it was just 2 months old due to the "space invaders" issue and they were excellent about it, I got an advanced RMA with a brand new card just 2 days later. These days though, I'm hearing that the RMA situation for Nvidia isn't as great as it used to be.
 
I use a riser cable, the GPU is securely mounted with no sag.

Are you mistaking the SF850L for the SF850? The SF850 got a great review from Hardware Busters: https://hwbusters.com/psus/corsair-sf850-atx-v3-1-psu-review/
The SF850 is around the same noise level as the SF750.
[Attachment: 1720717362015.png]


I've never really seen my PC use more than 590W in normal usage, I have had it around 720W during benchmarking with the GPU at a 600W limit.
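
As a quick sanity check on those wattages, this sketch works out how loaded each PSU would be. It assumes the 590W and 720W figures are what the PSU actually has to deliver (not wall draw), which the numbers above don't specify, and the function name is made up.

```python
def load_percent(draw_watts, psu_rating_watts):
    """System draw as a percentage of the PSU's rated output."""
    return 100.0 * draw_watts / psu_rating_watts


# Draw figures from the post above; PSU ratings for the SF750 and SF850.
for draw in (590, 720):
    for rating in (750, 850):
        pct = load_percent(draw, rating)
        print(f"{draw} W on a {rating} W unit: {pct:.1f}% of rating")
```

Under that assumption the 720W benchmark case sits at 96% of the SF750's rating versus about 85% on the SF850, while normal use stays under 80% on either.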

So far the RMA has been going well. I spoke with Nvidia last night via the chat on the support site, then followed up on the ticket through the ticket system/e-mail. They already approved the RMA, provided me the shipping label (QR code for FedEx), and told me I'll have a replacement within 5 business days of them receiving the card. I'll update if anything is off about it though.
 
Indeed, Corsair strikes again. I’d take the SF750 over the SF850L.
 
I’m also surprised they didn’t say to remove the riser, unless OP just didn’t say he was using a riser.

Risers have been a source of trouble for sure. Either way, it’s good practice to eliminate any nonessential factors.
Well I thought it would be obvious with my build in my signature, it literally starts with A4-H2O. But at the same time I thought replacing the riser with my backup riser was enough testing without needing to tear the build apart.
 
What cable are you using for the 12VHPWR?

This is the expected behavior if the card thinks it is receiving insufficient power, which it judges from the 4 sense pins in the small flat connector on the side of the 12VHPWR connector. Inspect the sense pins. If you are using the original adapter, I would buy a new adapter/cable that fits your PSU. If you are using a 12VHPWR-specific cable, try the included adapter instead.

Even if the main power supplying contacts in the 12VHPWR cable look fine, the sense pins can cause problems. When one of them disconnects, the card believes insufficient wires are in use, and powers off to prevent overheating/melting.
 
CableMod 600W (3x8-pin), but I also tested my Corsair (2x8-pin) one as well. These are cables from launch, and my 4090 is a launch 4090, so it doesn't have the shortened sense pins.

Edit:
I will do a test with the new PSU today for sure with all new cables, I do wonder if something got f'd up with the SF750, I have put it through hell over the last 5 years.
 
I’d just eliminate as many factors as possible.

I dreaded taking my Fractal Tera build apart but it turned out I had a riser cable issue myself. It was replaced under warranty.
 
Yeah, eliminating as many factors as possible is the best route. I was just being lazy and thought my backup cable would be enough. I always keep a backup riser since I had one fail in the past and had to wait like a week before the replacement showed up for my p3.
 
I ultimately stopped using cases with risers altogether because of some issues I had that seemed hard to pinpoint at the time.

Turns out a small mATX build was cheaper, not much larger, much easier to cool, and in general easier to build in. I was all about ITX for about 2 years. They look awesome. They are also a pain in the ass sometimes.
 
If it ends up being the riser(s), I might just get the A3-mATX honestly. The A4-H2O is an awesome case but sometimes it drives me insane, which is one reason I didn't want to tear it apart.

After I finish up my work for the day I'll start tearing my build apart and test it case-less with the card directly in the slot. If that resolves the issue I'm probably done with mITX. Or I could get an Ncase M2 since it has a riser-less ITX design.
 
How long did the 750W PSU work with the 4090?

If the 750 is working with a 3090, plus you've tried 2 different 12VHPWR cables, then the card is probably dead. The 3090 and 4090 pull pretty close to the same power, and they only pull their max TDP at full load. Can you game on the 3090? If that also crashes, then the PSU would be the likely suspect.
RMA time. FEs have a 3-year warranty.
 
I got my 4090 in early November 2022, the SF750 has worked fine with it since then, no issues until recently.

Yeah, I did get the RMA all ready to go. I'm shipping the GPU out tomorrow since it's better for me to pack it at work in the mailroom, and the FedEx office is nearby.

I agree, it's dead. I doubt it's the riser cable, but I'm going to test without it regardless, and I'll also test the new PSU just to be certain.

I appreciate all the responses, was just seeking more things to test that I may have overlooked.
 
The SF750 has powered the 3090 and 4090 for many people. I’ve seen plenty of ITX builds with it, and it’s almost the go-to PSU for builds needing an SFX PSU with a beefy GPU.

I used one on a 7900 XTX without issue.

I’m not saying it can’t be a problem, anything can. But the number of builds I see it in is astounding. Some have been running since the 3090 came out, according to Reddit.
 
Just an update:

I received my replacement 4090 today, so the total time it took was 13 days. The replacement card came in a very well packed large brown box and required a signature for FedEx. Fortunately my neighbor was home and handled that because I was in the office today. Nvidia did not give me tracking information or any notice that it was being shipped out, so I assumed it had not shipped yet. Anyway, the replacement is refurbished; it said so on the anti-static bag. At first I thought it might be my own 4090, repaired, so I checked the serial number and can confirm it is not the one I sent in. Either way it is in excellent condition, just like the one I shipped them. Overall I am satisfied with the RMA process, other than them not sending me a notice that it had shipped.

I installed the card and it works just fine. It does not have the shortened sense pins like the new connectors, so whatever happened to it with its original owner was not the connector. It also doesn't have the 1070mV limit like the new ones. So this weekend I will do some testing on it and push it to its limits. Maybe I'll try to set a new 3DMark high.

Also an update on the SF850 PSU and fan noise concerns. In the last 2 weeks of using the SF850 with my 3090 the fan has never kicked on, if it has I did not hear it.
 
bake it baby!!
 
So far the new card is looking a bit better than my original:

Original 4090: +135 MHz Core, +1700 VRAM (artifacts at 1800 in Superposition)
New 4090: +255 MHz Core, +1900 VRAM (artifacts at 2000 in Superposition)

I'm about to do a Metro Exodus Enhanced run to confirm its core stability. Each run takes about 40 mins, so it could take half a day to find the stable core, and then I might need to install TW3 NG to verify. Might be a near-golden sample refurb.

I kind of figured it was better than my old 4090 when playing Tekken 8, since it was drawing 60W less at the same settings: at max settings with DLSS 0.85 scaling my OG would pull ~180W, and the new card pulls ~120W.

Edit:
Ended up being stable at +150 MHz Core, +1825 VRAM. Still a slight upgrade. Beat my old Port Royal score by like 20pts :facepalm:
 