Graphics Card Necromancy, Continued: Radeon R9 290X

Discussion in 'Video Cards' started by RazorWind, Jun 7, 2019.

  1. RazorWind

    RazorWind 2[H]4U

    Messages:
    3,252
    Joined:
    Feb 11, 2001
    Folks! Following up on my previous thread here, we've got a new pile of dead graphics cards on the bench, and we're going to attempt to get at least one of them working.

    For those who prefer video, I made a video version.


    The cards in this case are Radeon R9 290Xs. Once again, we'll call them Cards A, B and C.

    IMG_4067.jpg

    Card C has belonged to me for several years. I actually traded for it with a fellow [H]er, back when it was relatively current. It works perfectly fine, although as you can probably guess from the photo, fans are somehow a consumable for these things.

    Cards A and B I bought on eBay for about $20 apiece. They're both "dead," although Card B did actually produce a picture when I tested it a couple of weeks ago. Subsequent tests resulted in no picture, though. All three cards are physically identical, but Card A has the reference BIOS, as opposed to the Sapphire overclocked version that B and C have.

    Given the apparently intermittent nature of the problem with Card B, we're going to concentrate on Card A as our candidate for repair first. Card C will serve as a reference, since it's undamaged, and works properly.

    I've already removed Card A's heatsink. Unsurprisingly, there's no sign of physical damage to the front of the PCB.

    IMG_4070.jpg

    But if we look at the back...


    IMG_4760.jpg
    What's this? A missing cap? Hmm...

    I'd be sort of surprised if the missing cap is the reason our card doesn't work. Those tiny caps are generally there to help filter out noise in the power plane, but as someone commented in the GTX 690 thread, his card didn't work with one of the smaller ceramic SMD caps broken off, so some designs may be sensitive enough that most of the caps are actually critical.

    Before we just replace that capacitor, though, we should do some additional testing, to make sure it's related to our apparent problem. First, we'll check resistance on each of our voltage rails to see if any are shorted or open. Remember, we're looking for between 1 and 1000 ohms. Anything in that range is probably OK.
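    As a rough sketch, that rule of thumb could be written down like this in Python, treating the 1 to 1000 figure as ohms. The function name, the labels and the example readings are all made up for illustration; only the window itself comes from the post.

```python
def classify_rail_resistance(ohms, low=1.0, high=1000.0):
    """Classify a rail-to-ground resistance reading taken on an unpowered card.

    low/high default to the 1-1000 ohm rule of thumb; below the window
    suggests a short, above it suggests an open circuit.
    """
    if ohms < low:
        return "possible short"
    if ohms > high:
        return "possibly open"
    return "probably OK"


# Hypothetical readings for illustration only, NOT the values in the photos.
example_readings = {"rail_x": 0.2, "rail_y": 45.0, "rail_z": float("inf")}

for rail, ohms in example_readings.items():
    print(f"{rail}: {ohms} ohms -> {classify_rail_resistance(ohms)}")
```

    An over-range ("OL") meter reading can be passed in as float("inf"), which lands in the "possibly open" bucket. This is only a first-pass screen; a reading inside the window doesn't prove the rail is healthy.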

    Here's VCore. Looks OK.
    resistance_vcore.jpg

    VDDCI (AKA memory power). Also looks OK.
    resistance_vddci.jpg

    The Aux rail. Also looks sane.
    resistance_aux.jpg

    The 1.8 rail. I think this is related to the display ports... looks sane.
    resistance_1_8.jpg

    The .95 rail - I don't actually know what this does, but it's required for function. Also looks sane...
    resistance_95.jpg

    Unknown SOP-8 chip on the back of the card, pin 8, which is usually the phase pin on this type of regulator. This looks sane too, although I don't know for sure what this IC does.

    resistance_unknown.jpg


    Ok, that's all of our resistances. We didn't find any shorts, so that's good, and there's nothing with a huge resistance, which might indicate a totally open circuit. Now we need to power the card up and see which rails actually run.

    VCore - this should be 1.0 - 1.2 volts. The reading is outside that range, so we know this rail isn't working.
    voltage_vcore.jpg

    VDDCI - this should be about 1.5 volts, so we know that this rail isn't working either. Notably, this and VCore share a pretty complex controller.
    voltage_vddci.jpg

    .95 - Ok, this one is working.

    voltage_95.jpg

    1.8v Rail - This one is working too.
    voltage_1_8.jpg

    Aux - Not working. I have a feeling that this may be waiting for an enable signal from something else, maybe the memory rail. I think I mentioned in the GTX 690 thread that the output of one VRM is frequently wired up to the enable input on another, so that rails start up in a specific order.
    voltage_aux.jpg

    5V Rail - Also working. This is what powers the VRMs themselves. The controllers need power of their own, and in some cases it's also used for the gate drive of the MOSFETs.

    voltage_5.jpg

    Ok, so we've learned that something major isn't working at all. These symptoms lead me to suspect the problem lies in or around the control IC for our VCore VRM, which is shared with the VDDCI VRM. That's a pretty elaborate chip with 56 (!!!!) tiny pins. I think the next step is to look for anything simple, like power to it, or an enable signal that's missing, and for that, I need to consult the data sheet.
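    The powered-up check above boils down to comparing each reading against what the rail is supposed to produce. A minimal Python sketch of that comparison follows. The VCore range and the nominal values are the ones quoted in the post; the +/-10% tolerance, the names, and the example readings are assumptions for illustration, and Aux is left out because no expected voltage is given for it.

```python
def window(nominal, tol=0.10):
    """Return (min, max) volts around a nominal value. The 10% tolerance is assumed."""
    return (nominal * (1 - tol), nominal * (1 + tol))


# Expected windows in volts. VCore's range is quoted directly; the rest are
# nominal values with the assumed tolerance applied.
EXPECTED = {
    "VCore": (1.0, 1.2),
    "VDDCI": window(1.5),
    "0.95V": window(0.95),
    "1.8V": window(1.8),
    "5V": window(5.0),
}


def dead_rails(measured):
    """Return the measured rails whose voltage falls outside the expected window."""
    bad = []
    for rail, (lo, hi) in EXPECTED.items():
        if rail in measured and not (lo <= measured[rail] <= hi):
            bad.append(rail)
    return bad


# Hypothetical readings, NOT the values from the photos.
readings = {"VCore": 0.0, "VDDCI": 0.0, "0.95V": 0.95, "1.8V": 1.8, "5V": 5.0}
print(dead_rails(readings))
```

    Rails that weren't measured are simply skipped, so the list only ever contains rails that were probed and found out of range.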
     

  2. FlawleZ

    FlawleZ Gawd

    Messages:
    790
    Joined:
    Oct 20, 2010
    Subscribed (again). I hope to learn more from your endeavor.
     
  3. Bawjaws

    Bawjaws Limp Gawd

    Messages:
    434
    Joined:
    Feb 20, 2017
    I love this stuff, although it's way over my head. Looking forward to reading more!
     
  4. Meeho

    Meeho [H]ardness Supreme

    Messages:
    4,303
    Joined:
    Aug 16, 2010
    You're the text version of Louis Rossmann
     
  5. THUMPer

    THUMPer 2[H]4U

    Messages:
    2,920
    Joined:
    May 6, 2008
    Make sure you do a follow up. You'll have some views shortly. :D
     
  6. Thevoid230

    Thevoid230 n00b

    Messages:
    33
    Joined:
    May 7, 2019
    Was following your 690 thread and by coincidence, I just had a G92 card go dead short and blow up the mosfet in a PSU in a family computer.
     
  7. RazorWind

    RazorWind 2[H]4U

    And here I was thinking he was the video version of me... :D

    Also, I made a video of this one.

    G92 = 8800GT generation? You probably got your money's worth out of that.

    A quick update:

    Three dead rails, two controllers. One (ONSemi NCP5230) has a reasonably complete data sheet. The other (Infineon IR3567B) just has a pinout, with no explanation of what the pins do. I suspect that's because it's super complex, it's designed to control two rails independently, and they actually program it at the factory for each customer's specific application.

    I did a lot of probing, and as best I can tell, I'm missing an enable signal for the memory rail. The trace has a test pad I can probe, but I think the other end of it is on the opposite side of the board, so I'm going to have to rig up a way to find it. I wanted to cover this in a video, so I didn't take any pictures. Without finding the other end of the trace, I can't be sure whether it's the memory waiting on the core/aux to power up or the other way around, but I don't think both of them are actually damaged. Another possibility is that the memory rail is starting up, but has a short I haven't found, and then aborts.

    Lastly, with great respect, I think Mr. Buildzoid may be incorrect in his 290X breakdown video. He claims that the memory and core VRMs are controlled by the IR3567B, and just kind of glosses over the Aux rail. This is untrue. In reality, it's the core and aux that are controlled by the IR3567B, and the memory is controlled by the NCP5230.

    Oh yeah, also Card B just started working again, at least well enough to load Windows. Not sure what's up with that, but it's convenient for testing purposes.
     
  8. RazorWind

    RazorWind 2[H]4U

    Semi-bored at work, and browsing eBay, I found this:
    https://www.ebay.com/itm/MSI-Lightn...739910?hash=item2acd954b86:g:rbQAAOSwdtBc-KJT

    And this:
    https://www.ebay.com/itm/Titan-Z-12...587991?hash=item287ed51e57:g:eCIAAOSwTPdc-vrH

    I'm kind of tempted to buy that Titan Z...

    Anyway, progress!


    I tracked down the data sheets for our control and phase drive ICs, and tested the phase drive pins for each of the large power MOSFETs on the board. We need the data sheets because two of each MOSFET's three terminals are hidden under that huge drain terminal on the top, so we need to know which pins on the controller they connect to, and check there.

    The VCore ones seem sane...
    High side:
    vcore_high.jpg

    Low Side:
    vcore_low.jpg

    The memory rail has a different controller with integrated phase drivers. Its pins are SUPER tiny, so I'm not even sure I'm probing the right pins, but this all looks at least kind of sane. I don't see anything that's obviously a problem here, so I'm moving on. I'll come back to it if I can't find anything else wrong...
    mem_high.jpg
    mem_low.jpg


    Finally, I looked at the Aux rail.

    High side looks alright.
    aux_hi.jpg

    But the low side...
    aux_low.jpg

    That's, uh, not so good.


    So, at this point, we know that some part of the low side gate drive on the Aux rail is basically shorted to ground. The problem could lie in either the phase drive IC or the MOSFETs themselves, but we can't tell which it is with both of them still on the board. So, let's remove the drive IC, since it's the easier of the two.

    Flux on...
    flux.jpg
    Heat it up...
    heat.jpg

    And off it comes.
    yank.jpg

    Now, we test the resistance on the low side gate again.
    better.jpg

    That looks much better. We'll confirm our issue by testing the IC we removed.
    yep.jpg

    Same resistance value as when it was on the board. So, while the MOSFETs may also be hosed, we know this IC definitely is.
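    The isolation step here (pull one part, then measure both the board and the removed part) can be sketched as a small decision function in Python. The threshold, the names and the example readings are assumptions for illustration; the actual readings are only in the photos.

```python
# Assumed cutoff for "basically shorted to ground", in ohms.
SHORT_THRESHOLD_OHMS = 1.0


def is_short(ohms):
    """True if a gate-to-ground reading is low enough to call a short."""
    return ohms < SHORT_THRESHOLD_OHMS


def isolate_gate_short(board_after_removal, removed_ic):
    """Name the suspects for a shorted low-side gate node.

    board_after_removal: gate-to-ground ohms on the board with the driver IC removed.
    removed_ic: the same measurement taken on the removed driver IC by itself.
    """
    suspects = set()
    if is_short(removed_ic):
        suspects.add("driver IC")
    if is_short(board_after_removal):
        suspects.add("MOSFETs or board")
    return suspects


# Hypothetical readings: the board reads high once the IC is off,
# and the IC alone still reads shorted.
print(isolate_gate_short(board_after_removal=5000.0, removed_ic=0.3))
```

    An empty result only means neither piece reads as a dead short by itself. As noted above, a clean board reading doesn't rule out MOSFETs that are damaged in some other way.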

    While I could cannibalize one of the working cards, I think I'll just source a new IC. I'll probably also get a couple of fresh MOSFETs, just in case. So, we'll reconvene once we have our spares on hand and can solder them back on the board.