My EVGA 1080 FTW has died, please help

Thalcore · n00b · Joined Nov 24, 2021 · Messages: 28
Hello, my name is Thalcore and my EVGA 1080 FTW ( Rev 1.1 / 1639 chip ) has died.

I have 2 of these cards in different computers, and the other one is still running great with no problems.

During heavy gaming this card started to crash now and then ( quite rare ).

In the logs I would find a code : 141 : LiveKernelEvent
but no description. Temps would be fine, 40-60 degrees at the time of the crash.

After a while it got worse, and any sort of overclocking while gaming would cause a crash.
I was using the card on stock settings with fans at max.

I convinced myself to try replacing all the thermal pads.

After replacing all the pads I thought I had fixed the issue. It ran great on a benchmark for about 3-4 hours on stock settings.

I applied the previous overclock settings it had : +50 clock / +500 memory.

Started the benchmark, it crashed, and it hasn't worked since.

-------------------------------------------------------------------------------------------------------------------------------------------------------------

Steps I've taken trying to solve this :

- Removed and reinstalled thermal pads and paste, making sure the gaps are right.

- Tried the card in another computer to see if it would boot. It booted straight into the BIOS, and the hard drive was wiped...
Now I'm reinstalling Windows as I write this on my other PC, and attempting to figure out where this card went wrong.

- I took the card to my mining rig and plugged it in there to see if it would boot or get recognized... it melted the USB riser cable.

- I took the card apart again, got out my multimeter and started looking around online for schematics or diagrams, when I stumbled upon
this forum. Please help me, gents.

- Following a guide for checking for shorts across the voltage rails, I got the result that it is safe to plug in and fire up because no shorts are detected.
But I'm scared to plug this GPU in after the last 2 attempts.

I can't really find a schematic or diagram anywhere to follow. Could someone possibly assist and walk me through checking for a short ?

Since the riser cable doesn't transfer power ( from what I understand anyway ), I don't think it made it to the GPU through the riser board. I could be wrong, but the riser board appears to be OK. The cable and PCIe connector are done though.

I think it might be because I run the motherboard off a dedicated power supply, and the GPUs and riser boards through another server-style PSU with a breakout board.

If I remember correctly I powered up the GPU PSU first, and it turned on the computer. It happened pretty quick. I didn't see smoke, just the smell on start up. Normally I power up the motherboard PSU first and wait for the motherboard RGBs to come on before I give the cards power and boot.
 

Attachments: 20211125_015903.jpg, 20211125_015842.jpg, 20211125_020106.jpg
On the GPU

If I check resistance between ground ( At the DVI housing ) and the following :

probe to probe : 0.0 ohms

Pcie x16 12v power : OL / open

Pcie x16 3.3v power : OL / open

Pcie x16 ground : 0.0 ohms

8 Pin Power 12v rail inductor # 1 / L19 : OL / open

8 Pin Power 12v rail inductor # 2 / L3500 : OL / open

5 Volt Rail # 1 ( by SLI connector ) / L1 : OL / open

5 Volt Rail # 2 ( by 8 pin connector ) / L2800 : OL / open

1.8 Volt rail / L20 : 342.2 ohms

PEX rail / L31 : 83.0 ohms
 
After watching a few videos and reading some forums,

I've come to the conclusion that :

- My 5v rail has an open / short or bypass to some other power circuit ( my information shows I should see 9k ohms + / my results : open )

- My 1.8V circuit has a short or a bypass somewhere bringing the resistance low ( my information shows I should see 800 + ohms / my results : 342 ohms )

More to follow.
 
OK, rookie mistake guys.

My meter was not set right. On the 5v rail I was reading OL / open, but in actuality there is resistance. It was just outside of what the meter's range setting could pick up.

I was starting to wonder why I had so many open resistors lol.

I need to correct the information moving forward.

Measuring resistance between ground and the following :

5 volt rail # 1 L1 : 5k ohms rising to 20k +

5 volt rail # 2 L2800 : 5k ohms rising to 20k +

I'm going to say that my 5v is OK, and move on to my 1.8v circuit being low.
 
I have found the following :

Resistors all appear to be ok.
 

Attachments: 1.8v Circuit.jpg, 1.8v Circuit 2.jpg
Here is the front and back side of this U3 chip, or what have ye.

I'm going to try to find out what I can and try to prove this is a failed component, because I'm running out of things to check again.
 

Attachments: U3 backside.jpg, U3.jpg
Here are some snips I took from the aforementioned links that I thought would help me in component testing.
 

Attachments: powering Bios Chip.jpg, image_2021-11-25_055225.png
Here are the results. I'm going to go for a smoke and think this over.
 

Attachments: U3 bios chip test results.jpg
After thinking about this for a little bit, I think if I could verify those readings against a known good card and the numbers were the same, it would mean the 1.8v circuit is OK and the problem is in this chip.
 
Given that the behavior is somewhat intermittent, I'd give it about an 80% chance that this is a BGA failure, which you're unlikely to be able to fix if this type of troubleshooting is at all difficult for you.

That said... Take the rail resistance measurements again. You shouldn't have OLs or infinity anywhere. The 12V inputs should be in the 10-20K ohm range, and the 3.3V should be 200-500 ohms. If you have any voltage input circuits that read way low, troubleshoot that, but I doubt you will, given the behavior.

Also check resistance to ground on the logic power rails. You should have at least four, and I think there's a fifth too (memory aux) but I couldn't tell you much more about it. Do not use short detection mode on your meter - take actual resistance measurements.

1. GPU Core - You're looking for something like 300 mOhms. That's milliohms. 0.3 ohms. If you try to measure this in short detection mode, it will appear to be shorted, even if it's perfectly fine. Only 0.0000 ohms should be considered a short circuit for sure, unless there's visible damage to something connected to this rail, like a burned power stage or a cracked cap.
2. Memory - Tens to maybe 150 ohms.
3. 1.8V - Maybe 500 ohms?
4. .95V - Anything over 200 ohms is probably OK.

There may be two 1.8V rails. I think one is for the PCI-E power, and the other is secondary power for the memory.

Then, start the card up with as much of the heatsink removed as you can without running the core totally bare, and check for voltage on each rail. This is the real test - is each regulator producing the expected voltage? You may need to do this test before doing the resistance check, if you're not sure which regulator is which.

1. GPU Core - 0.7-1.2V
2. Memory - 1.35V

The others should be obvious.

The problem is almost certainly not the BIOS chip circuit, if you're getting blank screens after the system boots, but it could be related to the power rail that supplies power to the BIOS chip (usually 3.3V, maybe 5V in some cases).
 
I have been in electronics repair for the past 30 years. The way you are doing your analysis does not help anyone give you advice.
You had better get in contact with a REAL doctor who does electronic repairs of VGA cards.
 
You're not wrong, but talking down to him won't help either. As you clearly love to remind us, you have thirty years of experience troubleshooting this sort of problem. This appears to be Thalcore's first time.
 
No. Let me introduce myself then : I own a repair lab worth 100,000 euros, with certifications, and I do not get involved with VGA repairs.
Remote assistance is limited to passing a tip to someone who can use it.
If that is not possible, then you remind him that he is not alone on this planet. Others out there can help, but with face-to-face contact.
 
Thank you for reaching out to me.

OK, so early this morning, after looking over my findings and checking between pins at the U3 BIOS chip, I found that pins 3, 7 and 8 had a resistance of 0.0 ohms between them. After checking this chart that makes sense, since they all connect to the 1V8 AON line. I didn't find a short at this chip, making me think this chip is fine.
 

Attachments: Pascal_Bios_schematic.png
I will offer a free of charge tip : NEVER EVER overclock a GPU that is sold with a small number of electrical phases ( PCB ) and an average-performance cooler.
Now check YouTube for how to verify whether MOSFETs are good or shorted.
 
Alright, so I've been working away gathering everything to start again.

With the negative probe attached to ground through my DVI housing I measured resistance at the following :

12v Rail : I noticed the resistance started at 1k ohms, would rise pretty fast to about 6k ohms, then jump to 30k ohms, then start coming down until it settled down at 10.8k and held steady.

12v Rail #1 : 10.80k ohms

12v Rail #2 : 10.87k ohms

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

5V Rail : Meter reading would rise or fall a few k ohms before they settled.

5v Rail #1 : around 21.32k ohms

5v Rail #2 : around 36.7k ohms

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------


1.8v Rail : 343.7 ohms

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

PEX Rail : 82 ohms

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Memory Rail :

Memory Rail #1 : 82 ohms

Memory Rail #2 : 82 ohms

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Pcie pins : the 12v Pin starts around 1k Ohms , rises to 6k , then jumps to 30k and rises to 40k +

12v Pins : 40k Ohms

3.3v Pin : 25k
 

Attachments: 12v rail resistance results.jpg, 5v Rail #1 results.jpg, 5v Rail #2 results.jpg, Pcie x16 results.jpg, PEX rail results.jpg, Memory Rail results.jpg
Alrighty !

Today I thought about what I should do next.

Those readings aren't terrible, but after the hard drive wipe and the riser cable incident I'm nervous about plugging this card into my systems.

I decided to set up a test bench. To power this card I will need to swap the mining rig's motherboard PSU for another one with the two 8 pin power cables this thing needs.

The reason I bring this up : while the rig is offline, I'm going to quickly pull my other working EVGA 1080 FTW, which should be close in production date, and get known-good readings to compare against and see if that points me in a direction.

I talked to my girl about it and convinced her, so I'm good to go.

I will post the working card's readings as soon as I get them done up like the previous pictures.
 
Here are the results from a proven good EVGA 1080 FTW ( Rev 1.1 / Chip 1639A1 )

-------------------------------------------------------------------------------------------------------------------

Pcie :

Pcie 12v pins : 52.30k ohms

pcie 3.3v pins : 25.83k ohms

--------------------------------------------------------------------------------------------------------------------

12v Rail :

12v Rail #1 : 10.92k ohms

12v Rail #2 : 10.90k ohms

--------------------------------------------------------------------------------------------------------------------

5v Rail :

5v Rail # 1 : 21.38k ohms

5v Rail # 2 : 36.86k ohms

-----------------------------------------------------------------------------------------------------------------------

1.8v Rail :

1.8v Rail : 896 ohms

----------------------------------------------------------------------------------------------------------------------

Memory Rail :

Memory Rail # 1 : 121.60k ohms

Memory Rail # 2 : 121.60k ohms

-------------------------------------------------------------------------------------------------------------------------

PEX Rail : 91 ohms
 

Attachments: PEX rail results.jpg, Pcie pins card 2 results.jpg, Memory Rail results.jpg, 12v card 2 results.jpg, 5v and 1.8v rail card 2 results.jpg, 5v rail card 2 results.jpg
My 1.8v rail resistance looks low to me. I'm going to go poke around again, and also see if I can find any more information.

What is strange is that all the resistors on the 1.8v line that I measured at 342 ohms on the bad card all measure 896 on this card :

the C264 , C266 , C270 , C272 resistors and the ones by U5 and U3.

Like we saw on the BIOS chip, where the pins all went through the 1V8 AON line and showed up as 0.0 ohms at the chip terminals.

Could we be going around the resistors if there is a short somewhere ?
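
A quick way to line the two sets of readings up is to work out how far each bad-card value sits from the known-good card. This is only a rough sketch using the numbers posted above; the 20% threshold and the name `flag_deviations` are assumptions picked for illustration, not a spec. The memory rail is left out because the two posts give it in different units ( ohms vs k ohms ).

```python
# Resistance to ground in ohms, copied from the two posts above.
# Each entry is (bad card, known-good card).
READINGS = {
    "PCIe 12V pins": (40_000.0, 52_300.0),
    "PCIe 3.3V pin": (25_000.0, 25_830.0),
    "12V rail #1": (10_800.0, 10_920.0),
    "12V rail #2": (10_870.0, 10_900.0),
    "5V rail #1": (21_320.0, 21_380.0),
    "5V rail #2": (36_700.0, 36_860.0),
    "1.8V rail": (343.7, 896.0),
    "PEX rail": (82.0, 91.0),
}


def flag_deviations(readings, threshold=0.20):
    """Return {rail: fractional deviation} for rails where the bad card
    differs from the good card by more than `threshold` (assumed 20%)."""
    flagged = {}
    for rail, (bad, good) in readings.items():
        deviation = (bad - good) / good
        if abs(deviation) > threshold:
            flagged[rail] = deviation
    return flagged


if __name__ == "__main__":
    for rail, dev in flag_deviations(READINGS).items():
        print(f"{rail}: {dev:+.0%} vs good card")
```

With these numbers only the PCIe 12V pins and the 1.8V rail cross the threshold. The PCIe 12V reading was still climbing when it was written down ( "40k +" ), so the 1.8V rail is the one that really stands out.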
 
Please don't laugh, but I rigged up a test bench to power this.

I found all voltages present except the 1v PEX rail.

Also during this, as you can see, while seating it in the x16 slot, capacitor C163 fell off. I think board flex ?

It shows markings of :

330
2R0
H36

I think it's a 2V 330uF tantalum capacitor ?? Any confirmation would be great.

I will order a new one and replace it.

---------------------------------------------------------------------------------------------------------------------------------------------------------

As for my problem :

I'm going to look for a diagram showing the 1v PEX rail enable circuits, and check for 1.8v , 3.3v and 5v going into the MP1475 style of chip ( as I'm imagining ) that sends the enable signal for my PEX rail to turn on ?

If the chip is missing a voltage at one of its pins, I'm going to chase that.
 

Attachments: 20211126_091606.jpg
If I power up the test bench and sit in the BIOS,

would that be less risky for my GPU without a cooler ?

Or would it be under just as much stress as sitting on the Windows desktop ?
 
Websites used :

- Vcore : https://repair.wiki/w/Vcore_Rail_on_Pascal_GPUs

- 1.8v : https://repair.wiki/w/1.8V_Rail_on_Pascal_GPUs.

- PEX Rail : https://repair.wiki/w/PEX_Rail_on_Pascal_GPUs.

--------------------------------------------------------------------------------------

- No PEX Rail voltage ( 1v )

- Low 1.8v Rail resistance.

--------------------------------------------------------------------------------------

- I know I have a Pgood signal from the 1.8v controller because the Vmem has voltage

- 1.8v controller gives a Pgood to the Vcore , I have Vcore voltage

- All voltages are within range.

- All resistances are within range except 1.8v ( about 300 ohms , spec 800 ohms + )

- The PEX controller regulates the 3.3V on the input to 1V on the output. It regulates that voltage through a resistive divider on the FB pin

-----------------------------------------------------------------------------------------

Could the PEX controller be bad ??

And because it outputs through a resistor ( I'm assuming by the name that a resistive divider uses resistance ), could it be pulling the 1.8v circuit low through, let's say, a short ?

or the PEX controller could be shorted to ground somewhere internally and pulling down the 1.8v through the FB circuit ?

any advice would be well appreciated.
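
On the resistive divider part: a regulator like this compares a scaled-down copy of its own output against an internal reference at the FB pin, and the two divider resistors set the scale. A minimal sketch of that arithmetic is below. The 0.8 V reference and both resistor values are made-up illustration numbers, not measurements or datasheet values for this card.

```python
def divider_vout(v_ref, r_top, r_bottom):
    """Output voltage the regulator settles at when FB sits at v_ref.

    r_top runs from the output to FB, r_bottom from FB to ground.
    """
    return v_ref * (1 + r_top / r_bottom)


def fb_voltage(v_out, r_top, r_bottom):
    """Voltage seen at FB for a given output (plain voltage divider)."""
    return v_out * r_bottom / (r_top + r_bottom)


# Hypothetical values, chosen only so the result lands at 1 V.
V_REF = 0.8          # assumed reference, volts
R_TOP = 10_000.0     # assumed, ohms
R_BOTTOM = 40_000.0  # assumed, ohms

if __name__ == "__main__":
    v_out = divider_vout(V_REF, R_TOP, R_BOTTOM)
    v_fb = fb_voltage(v_out, R_TOP, R_BOTTOM)
    print(f"Vout = {v_out:.2f} V, FB = {v_fb:.2f} V")
```

Note that a divider like this only connects the regulator's own output, the FB pin, and ground. Any link between the PEX controller and the 1.8V rail would have to come from somewhere else on the board, which is worth confirming on the card itself.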

-----------------------------------------------------------------------------------------------

steps I plan to take :

- Remove the Bios Chip and check the 1.8v resistance.

- Remove the PEX controller and check the 1.8v resistance.

- Replace that capacitor

Anything I'm missing, please add in.
 
So, you think you have two separate problems? One being that you have a short to ground on the auxiliary 1.8V rail, and the other being that the PEX rail (is that the 0.95V one?) isn't running?
 
Thank you so much for your response.

I feel I know just enough to be dangerous

Because of my low resistance, yes, I think there is a problem on my 1.8v rail.

The Pascal wiki points me towards the BIOS chip as a likely cause of low resistance, and it's an easier chip to pull than the 1v PEX rail chip.

If the resistance does not improve, I plan to pull the 1v PEX rail chip and then test again.

Hopefully it's just the 1v PEX rail chip causing both issues.
 
I've never diagnosed a gpu or attempted a repair on one before.

I don't even have a rework station yet.

Any suggestions ? All the reviews say they blow up and catch everything on fire.
 
Aoyue hot air stations are cheap enough that you can buy one, do a few repairs, and then if you decide you want to do a lot of repairs, buy a nicer one and not be too upset about it.

I actually have two Aoyue ones, and they're both fine. Just... don't leave them running unattended, and they (probably) won't burn your house down. I really like the 968A+ because it has a pretty nice adjustable soldering iron built in too, which you also need.

Regarding the 1.8V circuit, why do you think the problem is the BIOS chip, and not a cracked MLC cap or something? The BIOS chip isn't hard to remove and reinstall, but generally, speculatively removing ICs like that should be a last resort, and other problems are more likely to cause this anyway, such as a cracked MLC cap, failure in the PCB itself, or a failure on the GPU die.

Regarding the PEX rail, have you confirmed that you have Vin and EN on its pins? Like, actually confirmed, and not just assumed based on the schematic you have for a "Pascal" card? Not all Pascal cards are exactly the same, so you can't just assume they're all wired up exactly like the schematics for the reference design say they should be. Instead, you should confirm that you have the proper inputs on your card.
 
I will look into an Aoyue thank you.

The reason I am assuming the BIOS chip is strictly because of what I read on the wiki page.

I'm not thinking it's a cap, PCB, or GPU die because I cannot find any sign of physical damage from temps or anything melted.

When I try to follow the circuit I get resistance through the resistor or capacitor, but 0.0 ohms along the circuit in between resistors, making me feel the PCB traces are OK.

I am not the greatest at understanding total circuit resistance with resistors in parallel. If you lose one resistor, what happens to the circuit's total resistance ?

I'm nervous about checking the pins at the chip because they are so small and close together that I think I would absolutely short something.

I thought about checking at the near side of the nearest capacitor / resistor on each pin's circuit, but decided that since I would be assuming the PCB was good based on my 0.0 ohm test, I could just assume the rest was good.
 
Now you've got me thinking : if a resistor was blown in a parallel circuit I couldn't pick it up in place, so do I have to remove and test each one individually ? Start measuring amps and voltage drops ?
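
On the parallel resistance question: paths in parallel add as reciprocals, so losing a path ( open ) makes the total go up, and gaining an extra path to ground makes it go down. Here is a small sketch using the two 1.8V readings from this thread. Treating the good card's 896 ohms as the "normal" load is an assumption, and the helper names are just for illustration.

```python
def parallel(*resistances):
    """Total resistance of paths in parallel (ohms)."""
    return 1.0 / sum(1.0 / r for r in resistances)


def extra_path(r_normal, r_measured):
    """Resistance of one extra parallel path that would pull
    r_normal down to r_measured. Requires r_measured < r_normal."""
    if r_measured >= r_normal:
        raise ValueError("measured value is not lower than normal")
    return r_normal * r_measured / (r_normal - r_measured)


if __name__ == "__main__":
    # Two equal 1k paths together, then the same circuit with one path open.
    print(parallel(1000, 1000), parallel(1000))
    # 896 ohms (good card) vs 343.7 ohms (bad card) on the 1.8V rail.
    print(round(extra_path(896.0, 343.7), 1))
```

Under that assumption the bad card reads as if something around 560 ohms is sitting across the rail in addition to the normal load. It also covers the other half of the question: a part that fails open can only raise the reading, so a reading that dropped points to an added or leaky path rather than a missing one.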
 