Weird Issues with semi-working RTX 3080

Andrew_Carr

2[H]4U
Joined
Feb 26, 2005
Messages
2,573
Not sure what's wrong with this card, so I was wondering if anyone had ideas. It's a Zotac OEM model RTX 3080 that looks roughly like this one: https://www.techpowerup.com/vgabios/225496/zotac-rtx3080-10240-200901

Anyway, I took a risk and bought it used from someone who said it was broken. I plugged it into a Linux HiveOS machine and it was mining fine at 97MH/s and temps seemed fine (about 60C iirc). The next day it wasn't hashing anymore, but it's still detected by the OS. If I plug it into a Windows computer it will boot and display an output and seems to work fine, but the resolution is limited to 1024x768 and every other resolution is greyed out. When the resolution is increased, such as after installing the drivers, I get a black screen, but I don't get a "no display" error from my monitor. Also, temp sensors seem to be missing if I check in HWiNFO64 or GPU-Z. When I run it in Linux the temps on the core and memory are usually around mid 30C at idle.

I was thinking maybe something was damaged but I've taken it apart and everything seems fine, nothing burnt or anything obvious. Temps also seem fine so I'm not sure what's going on. Usually I figure these things work or they don't, not that they work somewhat like in this situation.

Update: Tried extracting the vbios again using GPUz, but it black screens.

Here's what GPUz and hwinfo64 see:
 

Attachments

  • gpuz rtx 3080 issue.png
  • rtx 3080 hwinfo64 temps.png
  • hwinfo main screen.png

Andrew_Carr

2[H]4U
Joined
Feb 26, 2005
Messages
2,573
Update: Seems to be fixed. I put it in a separate rig with a new PSU, motherboard, everything, removed my power limits on the card, and it's working again now, might've just been a power thing I guess.
 

RazorWind

Supreme [H]ardness
Joined
Feb 11, 2001
Messages
4,271
Was there any evidence of someone having messed with it prior to coming to you? (Missing warranty sticker?)
 

Andrew_Carr

2[H]4U
Joined
Feb 26, 2005
Messages
2,573
Yeah, the previous owner sold it as junk and non-working and said it was recognized in the BIOS but wasn't working. I opened it up to see if it was something simple and it looks like they had already opened it once (sticker missing on screw) and replaced some of the memory thermal pads (the OEM ones were black and plasticky on the backplate, but the ones on the VRAM etc. were grey and I figure were Gelid or Thermaltake pads).
 

pippenainteasy

[H]ard|Gawd
Joined
May 20, 2016
Messages
1,066
How much did you pay? Wonder if it was possible to get a warranty replacement and just claim it was a gift...
 

Andrew_Carr

2[H]4U
Joined
Feb 26, 2005
Messages
2,573
I spoke too soon. After an hour it quit on me again. The card is still recognized, so I probably have to fix something that's overheating, I guess. The trouble is that it's an OEM card with different pad thicknesses than the regular Trinity, so I roughly compared the pads before replacement but have no reference documents to go off of, and Zotac wouldn't help me with that (or an RMA). I paid $1300, or about half the going rate at the time. My latest buy, a $400 Radeon VII, isn't working either despite looking like it's NIB and untouched, so I might have another weekend of tinkering ahead with these cards.
 

pippenainteasy

[H]ard|Gawd
Joined
May 20, 2016
Messages
1,066
$400 vega 7 is an impressive price for a working card given its hashrate is the same as a 3080...too bad it's not working. You could probably still sell for a profit on eBay for parts or not working.
 

Andrew_Carr

2[H]4U
Joined
Feb 26, 2005
Messages
2,573
Yeah, it came from what looked like a big box reseller. The box was still sealed, zero dust/dirt on the card, and when I took it apart just now it looked totally unused (thermal stuff on heatsink is still in mint condition and thermal pads look perfect). No signs of disassembly, so I'll have to try to see if I can get it working by maybe using a washer mod or something and lowering temps. The card is recognized by my computer at least sometimes, so it doesn't seem to be totally dead. Probably going to try liquid metal on these to see if it fixes any temp related issues, even though temps when the RTX 3080 was mining were in the 70s.
 

Falkentyne

[H]ard|Gawd
Joined
Jul 19, 2000
Messages
1,823
Did you try cleaning the PCIE Slot pins with Deoxit D5? I've seen bad contact cause issues like that.

Also, try disassembling the card fully and completely.
Buy some 99% isopropyl alcohol.
Something like this should work (maybe you can find something similar locally, or a better deal).

https://www.amazon.com/gp/product/B07L6MMV7F/

Douse and submerge the board in it.
See if that fixes it.
I know one person who had a card acting up like this because, when he undid a conductive paint shunt mod, there were MG 842AR residue flakes bridging something, which became visible floating in the iso bath.
The card worked 100% perfectly after that bath.
 

Andrew_Carr

2[H]4U
Joined
Feb 26, 2005
Messages
2,573
Hmm.. thanks, just picked some up so I'll give that a shot. So far the D5 has removed a lot of corrosion from the pcie slot. Didn't see anything and I had cleaned it with alcohol previously, but one side seemed to have a decent amount of corrosion.
 

Andrew_Carr

2[H]4U
Joined
Feb 26, 2005
Messages
2,573
Ok, just tried all of that. No luck, still the same situation as before. I'm starting to think it's an issue with the heatsink not contacting the chip well enough. That might explain why the card performs as expected in mining (when it works) and works fine until you load a game. Temps seem ok but maybe they're spiking somewhere on the chip and it's throttling? I have a new batch of thermal pads arriving soon so I'll be able to try all sorts of combinations.
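One way to test the "spiking and then throttling" idea is to log readings once a second and look at the last rows before the card drops out. Below is a minimal sketch, assuming an NVIDIA driver with `nvidia-smi` on the PATH; the function names and the `gpu_log.csv` path are made up for illustration. Memory junction temperature is usually not exposed through these query fields on GeForce cards, so it only covers core temperature, power draw and clocks.

```python
import csv
import subprocess
import time

# Standard nvidia-smi query fields. Memory junction temperature is left out
# because GeForce cards usually do not report it through this interface.
FIELDS = ["timestamp", "temperature.gpu", "power.draw", "clocks.sm", "clocks.mem"]


def parse_sample(line):
    """Turn one CSV line of nvidia-smi output into a dict keyed by field name."""
    values = [v.strip() for v in line.split(",")]
    if len(values) != len(FIELDS):
        raise ValueError(f"unexpected nvidia-smi output: {line!r}")
    return dict(zip(FIELDS, values))


def read_sample(gpu_index=0):
    """Query the driver once; raises CalledProcessError if nvidia-smi fails."""
    out = subprocess.run(
        ["nvidia-smi", f"--id={gpu_index}",
         "--query-gpu=" + ",".join(FIELDS),
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_sample(out.strip().splitlines()[0])


def log_forever(path="gpu_log.csv", interval_s=1.0):
    """Append one row per interval and flush, so the last row survives a crash."""
    with open(path, "a", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        while True:
            writer.writerow(read_sample())
            fh.flush()
            time.sleep(interval_s)
```

Calling `log_forever()` in a terminal next to the miner should leave a CSV whose final rows show what the core was doing right before the failure. If nothing in there looks hot, that points away from core temperature as the cause.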
 

RazorWind

Supreme [H]ardness
Joined
Feb 11, 2001
Messages
4,271
What does the grease spread on the die look like?
 

Andrew_Carr

2[H]4U
Joined
Feb 26, 2005
Messages
2,573
I'll take it apart tomorrow and share some pictures. I was putting a lot of thermal paste on earlier, but this time I didn't put so much on. I'm down to 0.5mm and 1mm thermal pads on most of the components now, so I was thinking the core temps would be ok at least, but who knows.
 

Andrew_Carr

2[H]4U
Joined
Feb 26, 2005
Messages
2,573
So I'm hoping I fixed it finally. I took it apart again today and it looked like this. I repasted the die and put extra paste on the heatsink just to see if that would help. Then when I removed the metal frame I noticed one of the thermal pads was extending over a vram chip and the edge looked like it had been pushed over the silver bracket surrounding the die, so maybe that was causing poor contact? Anyway, after repasting and putting in a test rig it's mining fine and hasn't had any issues. Will see how long it lasts this time.
 

Attachments

  • thermal paste on die rtx 3080 6 8 2021.jpg
  • thermal paste on die and hsf.jpg

Andrew_Carr

2[H]4U
Joined
Feb 26, 2005
Messages
2,573
Update: Card worked fine for about 3 hours, temps were all low, then it stopped again. Back to being recognized as a generic card, and it seems like the driver isn't working. New thermal pads arrive soon, so I can try all types of combinations, I guess.
 

RazorWind

Supreme [H]ardness
Joined
Feb 11, 2001
Messages
4,271
Clean the grease off of the heatsink and die, and check the mating surfaces for flatness with the best straight edge and feeler gauges you can get your hands on. It looks to me like the mating surface of that vapor chamber is really convex (or the die is).

Solution being to lap or replace that heatsink. I'd use it as an excuse to go to water cooling, but maybe that's not practical for mining.

Edit: I don't think you're going to meet with much success just swapping thermal pads.
 

Andrew_Carr

2[H]4U
Joined
Feb 26, 2005
Messages
2,573
Hmm, yeah, hadn't checked that yet. Maybe it's uneven. I don't think watercooling is an option no matter the cost because it's a weird OEM style board, but I can try to smooth out the heatsink.
 

Woot910

Weaksauce
Joined
Feb 19, 2018
Messages
66
Good lord, that's a lot of paste. I try to add only what is needed so that the surrounding SMT components are not covered at all...
 

RazorWind

Supreme [H]ardness
Joined
Feb 11, 2001
Messages
4,271
Good lord, that's a lot of paste. I try to add only what is needed so that the surrounding SMT components are not covered at all...
This appears to have a heat spreader thing that covers the memory ICs, which then requires thermal paste between it and the actual heatsink.
Hmm, yeah, hadn't checked that yet. Maybe it's uneven. I don't think watercooling is an option no matter the cost because it's a weird OEM style board, but I can try to smooth out the heatsink.
Maybe this would fit it?
https://www.ekwb.com/shop/ek-quantum-vector-trinity-rtx-3080-3090-d-rgb-nickel-plexi
 

Andrew_Carr

2[H]4U
Joined
Feb 26, 2005
Messages
2,573
This appears to have a heat spreader thing that covers the memory ICs, which then requires thermal paste between it and the actual heatsink.
Yeah, that's exactly what it is. Thermal pads connect it to the VRAM modules but it's just a metal bar to prevent flexing so I wasn't sure if it was enough of a heatsink. And since core temps were really low I used some thermal paste to try to get a better connection between it and the heatsink. Doesn't seem to be doing much. I also picked up one of these to replace the backplate and try to better cool the VRAM. It didn't fit exactly but I got it on there somehow, and after lapping the heatsink for the core I reinstalled everything and tried again. Card crashed again after about an hour of mining even though temps were low, and now it's back to being a zombie card. Just put it in the new ultrasonic cleaner so I'll put it together again and see how it does now.
 

Attachments

  • rtx 3080 copper heatsink vram.jpg

Shadowarez

Gawd
Joined
Jul 8, 2019
Messages
868
If these are still non-functional, you could always try NorthridgeFix. Pretty sure he has more cards in for repair than an RMA center at this point, and he works on any card from 1060s to 3090s, from Canada to Dubai. Best chance if you want to get them working and nothing else is helping.
 

primetime

Supreme [H]ardness
Joined
Aug 17, 2005
Messages
7,293
If it works some of the time, it's likely a BGA issue of some kind, which I'm not equipped to deal with. :(
You don't have the right tools at home to fix this? Fuck it, I would try to bake it, but being very careful.
 

DaeviousMax

Weaksauce
Joined
Jun 8, 2021
Messages
125
Yeah, look up tutorials on baking your card in the oven. Do NOT put it in a microwave!!!! I believe Linus Tech Tips or one of the other YouTube tech-ers has a good video on how to bake the card. The idea is that you are heating the BGA solder to the point where any cracks heal. Sometimes it works, sometimes it doesn't. Good luck.
 

RazorWind

Supreme [H]ardness
Joined
Feb 11, 2001
Messages
4,271
I don't have a BGA machine, which is what you need to fix this. Hot air alone won't do it.

Before we suggest anything as barbaric as baking this poor card, we should make sure the OP has exhausted all of the other possible options.

Is it currently overclocked? Does the behavior persist if you underclock the memory and/or core? Is there an updated BIOS available for it? There were some complaints about crashing with early 30 series cards, and one of the fixes involved a BIOS update, as I recall.
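For the underclocking test on Linux, here is a rough sketch of how the core clock and board power could be capped through `nvidia-smi`. The helper names and the example numbers (1300 MHz, 220 W) are hypothetical, not recommended settings. `-lgc` and `-pl` are real `nvidia-smi` options, but they need root and the driver may reject values outside what the card supports. Memory clock is not covered here.

```python
import subprocess


def underclock_commands(gpu_index, max_core_mhz, power_limit_w, min_core_mhz=210):
    """Build nvidia-smi invocations that cap the core clock and board power."""
    if max_core_mhz < min_core_mhz:
        raise ValueError("max_core_mhz must be >= min_core_mhz")
    gpu = str(gpu_index)
    return [
        # -lgc locks the core clock into the given MIN,MAX range (MHz).
        ["nvidia-smi", "-i", gpu, "-lgc", f"{min_core_mhz},{max_core_mhz}"],
        # -pl sets the board power limit in watts.
        ["nvidia-smi", "-i", gpu, "-pl", str(power_limit_w)],
    ]


def apply_commands(commands):
    """Run each command in order; stops at the first one the driver rejects."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Something like `apply_commands(underclock_commands(0, 1300, 220))` would apply the caps, and `nvidia-smi -i 0 -rgc` should put the core clock range back to default afterwards. If the card still drops out at a heavily reduced clock and power limit, overclocking and power delivery become less likely explanations.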
 

primetime

Supreme [H]ardness
Joined
Aug 17, 2005
Messages
7,293
perfect advice
 

Andrew_Carr

2[H]4U
Joined
Feb 26, 2005
Messages
2,573
Alright, well I got it running again. Trying a 400MHz underclock on the core to see how it does. [Edit: Nevermind, it crashed after about 15 minutes. Temps were at 70C under load and 60C after ramping up the fan speed.]
 

Shadowarez

Gawd
Joined
Jul 8, 2019
Messages
868
You could try NorthridgeFix if you're in the States. This is the guy to get your card fixed. His channel on YouTube is pure gold, so informative. Can't recommend them enough.
 

Andrew_Carr

2[H]4U
Joined
Feb 26, 2005
Messages
2,573
Thanks. I was about to ask RazorWind to take it off my hands but I guess he doesn't have a BGA machine. I got a few hours out of it the other day so I'm going to try maybe sticking it next to the AC or something with some extra fans and see if it works one last time.
 

Shadowarez

Gawd
Joined
Jul 8, 2019
Messages
868
I know he has one, as I have had a very expensive motherboard fixed. I'm in Canada and I needed ASUS X299 WS Sage boards fixed by replacing sockets and BIOS chips. I feel bad, as he is like the only guy I know in NA capable of doing this, and he has a really great YouTube channel.
 

Shadowarez

Gawd
Joined
Jul 8, 2019
Messages
868
It's not bad considering the work involved. He replaced sockets and BIOS chips and made these boards actually reliable again. His first 3090 was a great vid.
 

Andrew_Carr

2[H]4U
Joined
Feb 26, 2005
Messages
2,573
Ok, resurrecting this since I've (knock on wood) hopefully solved things. I sent it to a local repair shop but they didn't do anything useful. Prior to that I was able to get it running for 6 hours straight by putting it in a normal case and trying different orientations. The lower winter temperatures might've helped a bit too. I just spent the last few days monitoring temps while I repadded and repasted about 10 times, and it looks like this model of the Zotac Trinity uses odd 1.25mm or something pads on the front memory (1.5mm is too thick: core temps were too high. 1mm is too thin: memory temps were too high). I went through a couple of attempts at squishing down 1.5mm pads until I got a decent connection. Then I applied two layers of Carbonaut carbon pads so I could get extra thickness on the die, and used some spacers on the heatsink screws since I was still getting 80C idle temps prior to that. Now I'm at 67C core and 94C memory temps under load, so I think this is good enough. I also left off the backplate, so I'm thinking that is preventing PCB flex that could've been causing the other issue (since it seemed to run better in some orientations than others).

Previously I had tried thermal putty since I could never get the thermal pad thickness just right, but it looks like memory temps were hitting 110C and throttling, so I guess that stuff shouldn't be used on GDDR6X memory even though it otherwise works pretty well. I bought some copper foil in 0.1mm thickness so I can layer that and adjust to the perfect height of things if this current configuration ends up not working long-term. But so far it seems stable.
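For the copper foil idea, the arithmetic is simple enough to sketch. This assumes the foil is stacked on top of a pad of known thickness and ignores how much the pad compresses under mounting pressure, so treat the numbers as a starting point. `shim_options` is a made-up helper, and everything is in micrometres to avoid floating-point rounding.

```python
def shim_options(target_um, pad_um, foil_um=100):
    """Return (foil layers, stack height in um) just under and just over the target."""
    if foil_um <= 0:
        raise ValueError("foil_um must be positive")
    gap = target_um - pad_um
    if gap <= 0:
        # The pad alone already meets or exceeds the target height.
        return [(0, pad_um)]
    below = gap // foil_um
    options = [(below, pad_um + below * foil_um)]
    if gap % foil_um:
        # The target falls between two whole layer counts, so offer the next one up.
        options.append((below + 1, pad_um + (below + 1) * foil_um))
    return options


# A 1mm pad aiming for roughly 1.25mm with 0.1mm foil:
print(shim_options(1250, 1000))  # [(2, 1200), (3, 1300)]
```

So with 0.1mm foil, a 1mm pad can only land at 1.2mm or 1.3mm, not exactly 1.25mm. Which side works better would have to come from watching core and memory temps, same as with the pads.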
 

Andrew_Carr

2[H]4U
Joined
Feb 26, 2005
Messages
2,573
What did the repair shop actually do?
I don't think they did anything. It looks like they disassembled it and then told me they couldn't fix it. They said they could do GPU repairs when I asked, but it looks like not really. Looking back, I don't think the sales/reception people really knew what they were talking about. Even with my other GPU I took in, where something blew up and it should've been at least an obvious component replacement, they said they couldn't do anything.
https://gesrepair.com/
 