TSMC Nanke 14 Factory Production Interruption Could Affect NVIDIA and Others

cageymaru

According to reports, the TSMC Nanke 14 Factory has experienced a production shutdown after substandard chemicals used in the manufacturing process ruined tens of thousands of wafers. These defects in the silicon wafers cannot be detected until after the production run. Affected customers include industry heavyweights such as NVIDIA, MediaTek, and Huawei HiSilicon, as well as makers of some ARM server processors. The 16/12nm process is one of TSMC's main sources of revenue. TSMC does not know the financial impact of the loss at this time, but it is expected to be extremely high because of the advanced products that use these chips, such as NVIDIA GPUs.

More detailed reports say the wafer contamination incident occurred at Fab 14 in the Nanke Technology Park. This fab was also one of the factories affected by the virus incident last year. Wafer manufacturing is a very demanding process that uses a wide variety of chemical materials, all of which must be of high purity. The incident stems from imported chemical materials that did not meet requirements, resulting in flaws in the wafers produced.
 
that's an oof.

guess the vega 2 / radeon 7 is going to be the buy for high end? either that or pay double that for a 2080 ti
 
An explanation for 2080ti failures???!??
That would be something, wouldn't it... I almost liked believing Nvidia was releasing a flawed design... However, this means that they might have been on the up and up. Wonder if we're all affected... Good God, that would mean everything they produced until now is likely affected.
 
Not good for NVIDIA no matter what; stock currently down 17% just off the earnings miss (thanks largely to slowdown in China).
 
Poor Jensen. Will have to return his gold toilet seat and exchange it for a plastic one.

Wouldn’t it be funny if this is what’s causing all the 20xx series cards to be dying at a rapid rate?

Which stupid Jensen should have mass recalled months ago yet he keeps making suppliers stock this shit from him to replace ten series to rip customers off and not give you support when rip off rtx on cards die??


Time for a class action to make Jensen lose all of his stock value.

Nvidia was a ticking time bomb it’s finally time the tables turn on them. I will never buy shit from them again or intel. I’ve been using PCs since 8086k days.

Next build and or upgrade will be Ryzen 3xxx and the new ati video card. I don’t care if I get 20 less FPS in a game above 100fps.

I’ll save money and have quality hardware unlike the lifeless intel/nvidia money scam rip off broken crap.
 
Not good for NVIDIA no matter what; stock currently down 17% just off the earnings miss (thanks largely to slowdown in China).

We see how bad Nvidia's plan was, as it affects the gaming and data-center businesses as well as the China thing. Now I wonder how bad AMD did with their plan and how hard China will hit them: Tuesday afternoon will tell.
 
snip

Nvidia was a ticking time bomb it’s finally time the tables turn on them. I will never buy shit from them again or intel. I’ve been using PCs since 8086k days.

Next build and or upgrade will be Ryzen 3 and the new anti video card. I don’t care if I get 20 less FPS in a game above 100fps.

I’ll save money and have quality hardware unlike the lifeless intel/nvidia money scam rip off broken crap.

You know, I was thinking the same thing not too long ago and ended up buying a used 1080 anyway. No matter how hard you try, the dark side will get you eventually.
 
I wonder where they imported (according to Google translate) the bad chemicals from?
No surprise, though: it's a common story in China for a supplier to provide materials that don't meet spec.
I had a scuba BCD (buoyancy control device) recalled because a Chinese subcon swapped non-marine-grade steel for the marine-grade steel spec'd for the springs in the BCD's valves, leading to the springs rapidly corroding in salt water and failing catastrophically.
 
Poor Jensen. Will have to return his gold toilet seat and exchange it for a plastic one.

Wouldn’t it be funny if this is what’s causing all the 20xx series cards to be dying at a rapid rate?

Which stupid Jensen should have mass recalled months ago yet he keeps making suppliers stock this shit from him to replace ten series to rip customers off and not give you support when rip off rtx on cards die??


Time for a class action to make Jensen lose all of his stock value.

Nvidia was a ticking time bomb it’s finally time the tables turn on them. I will never buy shit from them again or intel. I’ve been using PCs since 8086k days.

Next build and or upgrade will be Ryzen 3xxx and the new ati video card. I don’t care if I get 20 less FPS in a game above 100fps.

I’ll save money and have quality hardware unlike the lifeless intel/nvidia money scam rip off broken crap.
You realize that this would place the sole blame on TSMC rather than Nvidia, right? Even if there is a real flaw in the silicon, it will be swept under the rug because of this. Stock market hits and this issue will not destroy the largest and most recognized graphics company in the world. This is a minor setback for them. It's just like the process node that Intel dropped the ball on is a relatively minor setback for Intel. Both companies will lose some significant market share to AMD and other competitors. It's not going to kill them tho.

There is no reason, at this point, why Nvidia couldn't simply shift production to Samsung or Global Foundries (GF's 12nm might not be as advanced).
 
I wonder what the chances this was some form of industrial sabotage are? It seems strange that the chemicals needed would be tainted like that.
 
You realize that this would place the sole blame on TSMC rather than Nvidia, right? Even if there is a real flaw in the silicon, it will be swept under the rug because of this. Stock market hits and this issue will not destroy the largest and most recognized graphics company in the world. This is a minor setback for them. It's just like the process node that Intel dropped the ball on is a relatively minor setback for Intel. Both companies will lose some significant market share to AMD and other competitors. It's not going to kill them tho.

There is no reason, at this point, why Nvidia couldn't simply shift production to Samsung or Global Foundries (GF's 12nm might not be as advanced).

I would expect Nvidia to do qualification and QA on the silicon produced by TSMC. What are the chances that Nvidia detected the issues, and released products anyway...
 
This could very well explain the substandard chips and crazy high failure count. Outside companies (looking at CHINESE ONES) slipping in some substandard chemicals crucial to the wafer process may make something that looks viable in initial testing but cause the chips to degrade under the heat/voltage load of real-world long-term use. I mean, they did this with baby formula.... what more for something like this, which is not as scrutinized.

The failures we've been seeing are real, outright, massive chip failures. Even my dying 470 card functioned for years in my wife's PC in most applications and day to day, and only occasionally (1 in 100 uses) triggered a random reboot or display driver crash. That GPU died a true heat death. My 2080 Ti, by contrast, died a sudden and outright death. I'm talking it was not POSTing within 15 minutes of the first crash.
 
We see how bad Nvidia's plan was, as it affects the gaming and data-center businesses as well as the China thing. Now I wonder how bad AMD did with their plan and how hard China will hit them: Tuesday afternoon will tell.

To be fair, it's not just China. Caterpillar just had its biggest miss since 2008, and pretty much all the industries that correlate to growth missed their targets. We're probably a quarter or two away from a proper recession starting.
 
Nvidia was a ticking time bomb it’s finally time the tables turn on them. I will never buy shit from them again or intel. I’ve been using PCs since 8086k days.
Commendable, but will you still feel the same way in the future should AMD not be competitive with Nvidia for GPUs or Intel for CPUs?

Personally, I buy whichever product suits my needs best, not which company I like best.
 
tens of thousands of wafers.. ouch.

wonder if they will repurpose them as decorations.

have always wanted an actual wafer to check out
 
tens of thousands of wafers.. ouch.

wonder if they will repurpose them as decorations.

have always wanted an actual wafer to check out
I wonder if they detected it because the silicon was already generating space invaders, in real time, unpowered... Lol.
 
Curious - is [H] still looking at (or having them looked at) those 2080[ti] cards that died? It would make sense if this was the root cause... but kinda amazing that if it IS the root cause that it took this long to even find out or report? I mean... holy crap. It's not as bad as VW fudging emissions to get cars shipped, right?
 
Curious - is [H] still looking at (or having them looked at) those 2080[ti] cards that died? It would make sense if this was the root cause... but kinda amazing that if it IS the root cause that it took this long to even find out or report? I mean... holy crap. It's not as bad as VW fudging emissions to get cars shipped, right?
https://hardforum.com/threads/rtx-space-invaders-wanted.1973631/page-3#post-1044047340

189337_space_invaders.jpg

Our Space Invaders card is undergoing thermal testing this weekend. Monday we will start memory and memory infrastructure testing. Getting thermal out of the way seemed like a good place to start.
 
Anyone actually able to read the articles? I open them up, and it's all in Chinese.
 
RTX 2080 is 12nm and this shutdown appears to only affect 14nm, so wouldn't this just affect Tegra chips (eg. Nvidia Shield or Nintendo Switch) ???
The quote indicates that fab facility makes anything from 16nm down to 12nm
"According to reports, the TSMC Nanke 14 Factory has experienced a production shutdown after substandard chemicals used in the manufacturing process ruined tens of thousands of wafers. These defects in the silicon wafers are not able to be detected until after the production run. Companies affected include industry heavyweights such as NVIDIA, MediaTek, Huawei Hisilicon, and some ARM server processors. The 16/12nm process is one of TSMC's main sources of revenue. TSMC does not know the financial impact of the loss at this time, but it is expected to be extremely high due to the advanced technology that uses these chips such as NVIDIA GPUs."
 
RTX 2080 is 12nm and this shutdown appears to only affect 14nm, so wouldn't this just affect Tegra chips (eg. Nvidia Shield or Nintendo Switch) ???
Yes, I know. I was answering the first questions he specifically asked. "Curious - is [H] still looking at (or having them looked at) those 2080[ti] cards that died? "
 
An explanation for 2080ti failures???!??

Unless this has been going on for months and was just discovered, no. The failing cards people were encountering shortly after release were probably fabbed half a year or so back, between initial manufacturing for the release day stockpile and using cheap (and slow) maritime shipping instead of air mail to distribute them to the rest of the world. Also, if this was to blame for the early 20xx failures we should be seeing a similar rash of failures from all of the fab's other customers, but nothing else has shown up in the news.
 
Hmmm, I was more expecting a fire at Micron or Samsung’s DRAM production line, when I scanned the headline...

But I’m sure that won’t be many more months...
 
Not strange at all if they bought them from China. It's a nightmare dealing with chemical suppliers who source things in China.
I would normally agree, but the PRC has a few state-sponsored fabs that are just recently starting up in the 14nm range, and they are having trouble convincing Chinese companies to switch away from TSMC and use the local fabs instead. A huge slowdown would maybe cause a few of them to switch over, and this is likely going to be used as a local propaganda piece on why building in China is better. Even if they weren't sourcing their chemicals from China, how hard would it be for them to slip a tainted drum or two into the shipments?
 
you'd think with critical chemical components they'd test each batch and make sure it's up to spec.

I know right.. Where I work, they are required to test each batch of material used in production to make sure it meets spec... even if it is from the same exact supplier with the same exact part number.

Sounds like they need to have a testing lab onsite to test everything instead of just blindly trusting what they paid for is what they received.
 
Ran into this when I worked for a memory manufacturer - we had polonium contamination in our phosphoric acid causing occasional stray gamma rays that would damage thin oxides - so it'd wreck an occasional capacitor in our DRAM. It was such a tiny amount that it'd basically knock out one or two cells in ten thousand chips (which had 8 million cells each) - the worst kind of needle in a haystack you've ever tried to find. We finally traced it back by segregating the line and sending lots partially processed at our other locations except for certain steps on our line only - to where we finally narrowed it down to the wets area and then to the phosphoric acid steps early on in the process. Problem was you'd have to wait for the polonium to degrade so it'd take a few weeks for any cells to fail. I figured out that heat accelerated it a little (before we knew what caused it) so we'd test the wafers, then stick the lots in a high temp oven for a week and then re-measure. Cost us tens of millions and took almost 6 months to figure out, but luckily for us only affected one production line and didn't affect graphics RAM (due to the high speed nature of it, it didn't need as much capacitance). Even when we traced it back to the phosphoric acid, we couldn't figure out why it was a problem right away as we had the same supplier in all of our fabs. But, it turned out that for our US fab they had phosphorus mined in a different region than the EU and Asian fabs, and that was the difference. We're talking completely undetectable amounts of contamination, though we did figure out later how to detect it.

It wouldn't surprise me if they're dealing with something similar. We're talking undetectable amounts of contamination that would only show up after long periods of time over thousands of chips, and at very slightly elevated levels compared to background radiation - but enough for customers to see a slightly elevated failure rate. I won't go into everything we tried to isolate the issue, but one of them involved flying wafers on cross-continental flights to expose them to higher levels of outer atmosphere radiation to see if it accelerated the issue, thinking possibly something about the way the chips were manufactured made them more susceptible to background radiation. Turned out to be wrong, but we had the right idea in that radiation was causing it.
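
For a sense of scale, here is a rough back-of-the-envelope sketch of the failure rate described above. It only uses the figures quoted in the post (one or two failed cells per ten thousand chips, 8 million cells per chip); the function and constant names are made up for illustration.

```python
def cell_failure_fraction(failed_cells: int, chips: int, cells_per_chip: int) -> float:
    """Fraction of all memory cells that failed across a sample of chips."""
    total_cells = chips * cells_per_chip
    return failed_cells / total_cells


# Figures taken from the post; nothing here is measured data.
CHIPS = 10_000
CELLS_PER_CHIP = 8_000_000

low = cell_failure_fraction(1, CHIPS, CELLS_PER_CHIP)   # one failed cell
high = cell_failure_fraction(2, CHIPS, CELLS_PER_CHIP)  # two failed cells

print(f"total cells in sample: {CHIPS * CELLS_PER_CHIP:.1e}")
print(f"failure fraction: {low:.2e} to {high:.2e}")
```

That works out to 8e10 cells in the sample and a failure fraction of about 1.25e-11 to 2.5e-11, i.e. on the order of one or two bad cells per hundred billion, which is why it was such a needle in a haystack.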
 
I wonder what the chances this was some form of industrial sabotage are? It seems strange that the chemicals needed would be tainted like that.

It's almost difficult to believe they don't test their chemicals for identity and purity before use.

In biopharma it is a regulatory requirement, as part of incoming inspection, to do identity and purity testing in a QC lab using some sort of HPLC or LC-MS method.

Sure these things cost money, but they are WAAAAAAY cheaper than losing several lots of silicon.
 
Ran into this when I worked for a memory manufacturer - we had polonium contamination in our phosphoric acid causing occasional stray gamma rays that would damage thin oxides - so it'd wreck an occasional capacitor in our DRAM. It was such a tiny amount that it'd basically knock out one or two cells in ten thousand chips (which had 8 million cells each) - the worst kind of needle in a haystack you've ever tried to find. We finally traced it back by segregating the line and sending lots partially processed at our other locations except for certain steps on our line only - to where we finally narrowed it down to the wets area and then to the phosphoric acid steps early on in the process. Problem was you'd have to wait for the polonium to degrade so it'd take a few weeks for any cells to fail. I figured out that heat accelerated it a little (before we knew what caused it) so we'd test the wafers, then stick the lots in a high temp oven for a week and then re-measure. Cost us tens of millions and took almost 6 months to figure out, but luckily for us only affected one production line and didn't affect graphics RAM (due to the high speed nature of it, it didn't need as much capacitance). Even when we traced it back to the phosphoric acid, we couldn't figure out why it was a problem right away as we had the same supplier in all of our fabs. But, it turned out that for our US fab they had phosphorus mined in a different region than the EU and Asian fabs, and that was the difference. We're talking completely undetectable amounts of contamination, though we did figure out later how to detect it.

It wouldn't surprise me if they're dealing with something similar. We're talking undetectable amounts of contamination that would only show up after long periods of time over thousands of chips, and at very slightly elevated levels compared to background radiation - but enough for customers to see a slightly elevated failure rate. I won't go into everything we tried to isolate the issue, but one of them involved flying wafers on cross-continental flights to expose them to higher levels of outer atmosphere radiation to see if it accelerated the issue, thinking possibly something about the way the chips were manufactured made them more susceptible to background radiation. Turned out to be wrong, but we had the right idea in that radiation was causing it.

Perhaps not intentional, but with the chemicals involved and the volume, this is an entirely plausible scenario which is just hard luck. The 2080 Ti GPU chip is a monster, so it would be more likely to fail compared to other chips manufactured at roughly the same time. This also leads into the idea that it is extremely difficult to trace the cause of the failure, even given the number of failures out there, if this was the case. Now this can all be bunk and Nvidia just designed a flawed chip, but I'd think we'd see something consistent on how to kill the cards if that was the case.
 