The Moment Your RTX 2080 Ti FE Fails

Discussion in 'HardForum Tech News' started by FrgMstr, Nov 14, 2018.

  1. FrgMstr

    FrgMstr Just Plain Mean Staff Member

    Messages:
    48,401
    Joined:
    May 18, 1997
    So we are starting to play around with our failed RTX 2080 Ti Founders Edition card this evening. Below are two pictures of the card operating on an open test bench with an ambient temperature of 68F/20C. The card ran Heaven for about 45 seconds before locking up with the same artifacts we saw when it failed last Friday while gaming. We did purchase two 2080 Ti FE cards, so we will likely start dissecting the "good" card to get some better temperature readings. It is hard to test when your $1200 video card is broken.

    Thermal images with Temperature Sensors.

    138F/59C on the exterior of the backplate might seem a bit excessive; we will find out for sure whether it is. As a side note, the FLIR ONE Pro is on sale.
     
  2. Joust

    Joust 2[H]4U

    Messages:
    2,881
    Joined:
    Nov 30, 2017
    Predator-Vision. [H]ard.
     
  3. Squall_Rinoa89

    Squall_Rinoa89 Limp Gawd

    Messages:
    411
    Joined:
    May 4, 2011
    Question! What were the temps of the 1070/1080 cards that would self-immolate? Anyone remember?
     
  4. ShuttleLuv

    ShuttleLuv [H]ardness Supreme

    Messages:
    7,050
    Joined:
    Apr 12, 2003
    Shitty QC... this is reminding me of the 5870.
     
  5. Slade

    Slade 2[H]4U

    Messages:
    2,539
    Joined:
    Jun 9, 2004
    That backplate temperature is pretty high. I was getting 122F around the GPU area and 112F at the outer edges.

    I think multi monitor mode runs the power use around 55W. Can you check if it goes into low power for a single monitor?
     
  6. pcgeekesq

    pcgeekesq [H]ard|Gawd

    Messages:
    1,403
    Joined:
    Apr 23, 2012
    For reference, mozzarella cheese melts at 55 degrees C (130F).
    Swiss cheese melts at 66 C (150F)
    Parmesan cheese melts at 83C (180F).

    So your backplate is hot enough to melt pizza cheese, but you're going to have to up your game to melt [H]ard cheeses.
     
  7. DNMock

    DNMock Limp Gawd

    Messages:
    399
    Joined:
    Apr 16, 2015
    But you will still get Salmonella from the chicken you bbq'ed on it.
     
  8. Joust

    Joust 2[H]4U

    Messages:
    2,881
    Joined:
    Nov 30, 2017
    Mozzarella, Parmesan - after this thing lights off, all that'll be left is de brie.
     
  9. Paul_Johnson

    Paul_Johnson [H] Admin Staff Member

    Messages:
    15,863
    Joined:
    Aug 29, 2004
    That is why you use american cheese for TIM.
     
  10. FrgMstr

    FrgMstr Just Plain Mean Staff Member

    Messages:
    48,401
    Joined:
    May 18, 1997
    I even made the sounds with my mouth while I was taking the pictures.

     
  11. STEvil

    STEvil 2[H]4U

    Messages:
    2,819
    Joined:
    Oct 17, 2000
    I think the VRM is failing.

    If you can hold the core at a specific temperature and step it up over time to the point where it fails, then the fault may be core or memory related. But if it fails at any temperature in a semi-predictable time period (which would depend on how long the specific bad part takes to heat up), then part of the VRM may be at fault.
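As a rough illustration of the test described above, here is a minimal Python sketch that sorts a set of bench runs into "temperature dependent" or "time dependent". The function name, the trial format, and the 25% spread tolerance are all assumptions made up for this example; it is a simple heuristic, not a validated diagnostic.

```python
from statistics import mean, pstdev


def classify_failure(trials, spread_tolerance=0.25):
    """Classify bench runs (hypothetical helper, not a real tool).

    trials: list of (held_core_temp_c, seconds_to_failure), with
    seconds_to_failure set to None when the run did not fail.
    spread_tolerance: assumed cutoff for "semi-predictable" timing,
    expressed as a fraction of the mean time to failure.
    """
    failed = [(temp, secs) for temp, secs in trials if secs is not None]
    passed = [temp for temp, secs in trials if secs is None]

    if not failed:
        return "no failure observed"

    # Every passing run was cooler than every failing run:
    # the failure appears to track core temperature.
    if passed and max(passed) < min(temp for temp, _ in failed):
        return "temperature dependent (core or memory suspected)"

    # It failed at every temperature tried, after a similar amount of time:
    # something else is heating up on its own schedule.
    times = [secs for _, secs in failed]
    if not passed and pstdev(times) <= spread_tolerance * mean(times):
        return "time dependent (VRM suspected)"

    return "inconclusive"


if __name__ == "__main__":
    # Made-up example data, not measurements from this card.
    print(classify_failure([(50, None), (60, None), (70, 120), (80, 40)]))
    print(classify_failure([(50, 45), (60, 48), (70, 44)]))
```

A handful of runs will not prove anything either way, but logging the held temperature and the time to lock-up for each run makes the pattern easier to spot.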
     
  12. DrBorg

    DrBorg Gawd

    Messages:
    555
    Joined:
    Jan 22, 2005
    If you could measure the Vpp voltage, I think that will tell us a bunch.

    I can't find a number for the current draw, but the part that caught on fire was the Vpp power supply.

    There's an R005 resistor; measuring the voltage there will tell us what the heat is about, and it is likely where the two boards differ.

    That's the current sense resistor for the Vpp power supply.

    Anything below 1.5V is likely a problem. And I'll bet a cookie the good board is there, and the bad one is below that.

    You might look at both, and see if the good board Actually has capacitors on the pads where the other one probably does not have them, just empty pads.

    There are three power sections for this chip, one for internal power, one for pin drivers, and one for internal biasing; I think the internal biasing, Vpp, is the fail.


    Thermal cameras are awesome for troubleshooting. :)


    I posted this in another thread, this is what I think happened to this card:

    Looking at the PCB photos linked in the threads, this looks like the vias overloaded when the power supply section here failed.

    There are 10 vias, looking like ~10 mil vias, so those are likely good for about 1.2A each, so 12A total.

    This would raise the temperature to 42C, if it was alone, but power adds, so Wow, no wonder it burned.

    I'm not printing that number until I check it.

    This power supply is going to be the issue, if it falls, the chips overheat, and it goes downhill fast.

    This power supply is the Vboost for the Memory chips, and if this voltage is too low, the chips will run hot, and draw more power.


    Nvidia says capacitor failures, but I don't see anywhere Near enough caps for this power level.

    This is based on a similar design on the Titan, and it has 4 large caps right beside this part, and only has one PS section; this has two. And No Big Caps.

    There seem to be two empty capacitor pads right beside the L64 inductor label, and those are pretty important. :)

    There don't seem to be ANY large caps at all; all are 0201, or 0603 at most.


    I can't believe there's no listing on how much power this memory chip draws; I've never seen a datasheet without it.

    There are several references to low power, and lower voltage, but 1.25V at 10A is 12.5W, as is 1.5V at 8.33A, so lower voltage doesn't mean lower power.
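The arithmetic in that last point is just P = V x I. A minimal Python sketch, using only the two operating points quoted above (the function name is made up for illustration):

```python
def power_watts(volts, amps):
    """Electrical power in watts from voltage and current (P = V * I)."""
    return volts * amps


if __name__ == "__main__":
    # The two operating points quoted in the post.
    for volts, amps in [(1.25, 10.0), (1.5, 8.33)]:
        print(f"{volts} V at {amps} A is about {power_watts(volts, amps):.1f} W")
```

Both cases land at roughly the same wattage, which is the point: a lower supply voltage only means lower power if the current does not rise to compensate.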

    This board needed twice the power level of the Titan, after all; there are two power sections.


    Look at your boards; if all those cap pads are empty, there will be a recall.

    Here are good pix:

    https://xdevs.com/guide/evga_2080tixc/

    Anyone want to measure across the R005 resistor for me, while it's running? :D That's a 0.005 ohm resistor, and is for current sensing for the controller chip...
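For anyone taking that measurement, a minimal Python sketch of how the reading across a 0.005 ohm sense resistor converts to current via Ohm's law. The helper names are made up for illustration, and the 12 A figure is only the via estimate from earlier in this post, not a measured or datasheet value.

```python
R_SENSE_OHMS = 0.005  # value implied by the "R005" marking, per the post


def current_from_drop(drop_volts, r_ohms=R_SENSE_OHMS):
    """Current through the sense resistor from the voltage across it (I = V / R)."""
    return drop_volts / r_ohms


def drop_from_current(amps, r_ohms=R_SENSE_OHMS):
    """Voltage expected across the sense resistor at a given current (V = I * R)."""
    return amps * r_ohms


def resistor_dissipation_watts(amps, r_ohms=R_SENSE_OHMS):
    """Heat dissipated in the sense resistor itself (P = I^2 * R)."""
    return amps ** 2 * r_ohms


if __name__ == "__main__":
    amps = 12.0  # assumed: the 10-via estimate above, not a measurement
    print(f"Expected drop at {amps} A: {drop_from_current(amps) * 1000:.0f} mV")
    print(f"Dissipated in the resistor: {resistor_dissipation_watts(amps):.2f} W")
    print(f"A 25 mV reading would mean about {current_from_drop(0.025):.1f} A")
```

The drop is only tens of millivolts at these currents, so the meter needs millivolt resolution and the probes need to sit directly on the resistor's pads.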
     
    Last edited: Nov 15, 2018
  13. hitched

    hitched Limp Gawd

    Messages:
    205
    Joined:
    Jan 12, 2011
    If the problem is power related, won't this issue do nothing but get worse when those tensor cores are switched on?

    Hell, if it's heat related, those cores aren't going to make the card any cooler...
     
  14. lostin3d

    lostin3d [H]ard|Gawd

    Messages:
    2,040
    Joined:
    Oct 13, 2016
    A few months back I put a frozen steak in a container and left it on top of my SLI rig for a few hours to thaw. Wife kept coming in and asking why it smelled so good, hilarious.
     
  15. magoo

    magoo [H]ardForum Junkie

    Messages:
    14,363
    Joined:
    Oct 21, 2004
    I would be more concerned about buying a good FIRE EXTINGUISHER........

    so much for "test escapes"
     
  16. BloodyIron

    BloodyIron 2[H]4U

    Messages:
    3,443
    Joined:
    Jul 11, 2005
    So let me get this straight, these RTX 2080 Ti Founders Edition cards are just being benchmarked with... stock... settings? And failing?

    If that's so, then that's a fucking rip off.
     
  17. Hagrid

    Hagrid [H]ardForum Junkie

    Messages:
    8,398
    Joined:
    Nov 23, 2006
    Now imagine people overclocking and a power mod.
     
  18. xmadror

    xmadror Gawd

    Messages:
    695
    Joined:
    Feb 13, 2012
  19. Guarana [BAWLS]

    Guarana [BAWLS] [H]ard|Gawd

    Messages:
    1,787
    Joined:
    Oct 3, 2001
    If you get the chicken up to the 62-64C range throughout, you're actually safe....
     
  20. pcgeekesq

    pcgeekesq [H]ard|Gawd

    Messages:
    1,403
    Joined:
    Apr 23, 2012
    Just leave the chicken exposed to your Cobalt-60 gamma ray source for a bit, and you're safe.
     
  21. NoOther

    NoOther [H]ardness Supreme

    Messages:
    6,479
    Joined:
    May 14, 2008
    That backplate does seem pretty hot. I certainly wouldn't want to put dual GPUs in with that kind of heat.

    Also, sadly that camera is not compatible with my phone.

    Just need to get that Government cheese, everyone knows Government cheese doesn't melt.
     
  22. Hagrid

    Hagrid [H]ardForum Junkie

    Messages:
    8,398
    Joined:
    Nov 23, 2006
    We will know soon from heatlesssun, since he bought two, I believe. Maybe he can give us some numbers on heat and how close together they sit on his MB.
     
  23. pcgeekesq

    pcgeekesq [H]ard|Gawd

    Messages:
    1,403
    Joined:
    Apr 23, 2012
    You misspelled "Gub'min cheez."
     
  24. FrgMstr

    FrgMstr Just Plain Mean Staff Member

    Messages:
    48,401
    Joined:
    May 18, 1997
    Got the next RTX 2080 Ti FE on the test bench. This is on an open test bench after 1.5 runs of Heaven.

    flir_20181119T015253.jpg
     
  25. FrgMstr

    FrgMstr Just Plain Mean Staff Member

    Messages:
    48,401
    Joined:
    May 18, 1997
    It seems to level off really quickly. Just checked after an hour-long run and temps are pretty much the same.
     
  26. Spartacus

    Spartacus [H]ard|Gawd

    Messages:
    1,915
    Joined:
    Apr 29, 2005
  27. FrgMstr

    FrgMstr Just Plain Mean Staff Member

    Messages:
    48,401
    Joined:
    May 18, 1997
    We are actually going to do an article. Just some fun shots here I wanted to share.
     
  28. FrgMstr

    FrgMstr Just Plain Mean Staff Member

    Messages:
    48,401
    Joined:
    May 18, 1997
    That overlay is awesome. Would you send me over the PSD file that you used? Kyle@HardOCP.com Pretty please. :)
     
  29. FrgMstr

    FrgMstr Just Plain Mean Staff Member

    Messages:
    48,401
    Joined:
    May 18, 1997
    My what a lot of thermal pads you have...

    IMG_20181119_185108.jpg
     
  30. Nobu

    Nobu 2[H]4U

    Messages:
    3,192
    Joined:
    Jun 7, 2007
    At least we know the VRM isn't overheating... probably. The card looks scary hot near the PCIe slot, though. Could that be an issue? The pins couldn't desolder from that amount of heat; it would have to be more, right?