I'm interested in the detailed process they used for each test, if you can gather some information and pictures. Maybe do a visit?
Not going to happen. Too much IP to be exposed there.
That would make a great article
Space Invaders: The journey of a test escape!
I don't hate Nvidia. I just wish they were a shitload more transparent and gave a damn about their consumers.
IMO, for those reasons you should hate nVidia (or at least be pissed at them). You can like their products, I can respect that. I hate nVidia, but I still recognize their superiority; I just personally will have nothing to do with them regardless.
(And yet, despite my lack of respect for them, I still feel it necessary to spell their name the way they used to... Go figure... lol)
Hatred maintained is a slippery slope, my friend. I try not to hate anyone. I can be mad as hell at someone or something for a while... but even that I let go. I try to understand why people and companies do what they do instead. That's how I've come to terms with people that have done some really unforgivable things to me in my life. I can forgive; forgetting is another matter entirely.
Trust me, I realize hate is a strong word and that it may not really be the proper word to use in this context. I am also a forgive-but-not-forget person, as you are; however, some people (and companies, which are made up of people) just don't have any desire to change and will continue on the path they always have. Therefore, there are times when forgiveness just isn't deserved, not until they manage to change their ways and continue to do so.
Nvidia is a corporation that's run largely unchecked in its lifetime, and it's doing everything it can to remain top dog in an industry that can change in a few short years. While their tactics look a lot like what Intel did to AMD in the past, and they've more or less worked in one form or another, they will not be well received if evidence of their anti-consumer practices ever comes to light. It will probably hit their stock value, give them a slap on the wrist and send them back to work. That sounds underwhelming, but they won't likely pull this shit again if they're exposed.
The fact that the number one graphics card maker is even resorting to these measures indicates they fear the future. They're trying to crush AMD's graphics division before Intel enters the market.
Say AMD recaptures 36% of the market share after Navi. So they're back up there, and then Intel steals 15-20% of the market from Nvidia... That's why they're trying to place a stranglehold on the market: they know that with more consumer choice, they're going to lose market share, no matter what.
I buy their tech because it's the fastest. Someone else releases a 4K, max resolution graphics card that will put out 60+ FPS in damn near all games, I will buy it. Until then, I only have one choice.
I may despise their business practices but if I've learned one thing in the IT industry it's that corporate ethics are pretty damn hard to come by. I have a business degree and an IT degree and the business one taught me all about good business ethics... None of which I have seen, in damn near any form, in corporate America...
Does anyone know if Nvidia has mostly resolved this issue with current manufacturing? Or is this problem going on at the same rate or slightly diminished?
No.
I wonder if it was related to the TSMC chemical issues reported earlier, which caused a number of problems with 12nm-14nm dies. This should really be a huge news story that keeps being updated, but the quietness of it all is actually rather disturbing. A 20% failure rate is way out there for a product, especially if these GPUs are used for more than gaming.
Kudos to [H] and Kyle for doing this.
What is disturbing to me is the way Nvidia is handling this failure. Kyle mentioned some quiet, backchannel, comms from AIBs. That indicates the AIBs are reticent to speak out. Obviously, the source of that reticence is the fear of Nvidia's response.
This aligns with GPP and Nvidia's other market practices. Unfortunately, none of these reflect well on Nvidia.
Manufacturing errors do occur. With GPUs and their billions of transistors melded into a complex working whole, it's amazing they work as well as they do, as often as they do. Imagine if Nvidia had taken an open approach, stopped selling their cards, put out a reward for anyone discovering the source of the problem, etc. In other words, an open and frank admission. What a difference.
Keeping OT, this failure mode and the failure rate are very interesting. Obviously, Nvidia did not mean for this to occur... How did such an experienced manufacturer fall into this error?
Probably tried to cut cost somewhere. In a different industry I have seen a company lower the cost of a chip by 1-2 cents and the board suddenly show a 15% failure rate in the field after some time because the chips were failing. I don't know what they changed to make them a tad cheaper, but it caused a lot of problems, that is for sure. So it could be something as simple as this: their profits were starting to slip, they tried to save a few cents here and a few cents there, and the slight changes are resulting in these issues.
Not sure I’d buy that given how overengineered the VRM was and the generally high fit and finish. There were plenty of places they could have skimped first that wouldn’t carry the risk. I would more readily believe an outsourced assembler or supplier cutting corners and not telling them.
I do agree their heavy-handed BS to silence partners is no bueno though, and I respect the work that is going into this.
Wouldn't explain custom cards though. Custom designed cards use different power circuitry than the stock board. The defect has to be something that doesn't change between stock and custom cards.
That would only leave the GPU die which isn’t fabbed by NVIDIA?
That and the memory. There are likely other minor chips and parts shared between stock and custom cards, but not sure if they'd be causing the issue.
Memory chips are picked by AIBs as long as they meet the spec
Only two companies make GDDR6 so Nvidia and AIBs both use Micron and Samsung chips. That said, given that this happens across both manufacturers, it is unlikely the culprit.
Could be that Nvidia engineers decided to ignore or misinterpreted some parts of the GDDR6 spec, which could cause issues down the road (if not immediately). I kinda doubt that, but if it was their OpenGL team I wouldn't be surprised.
The problem with their cards is in the silicon, their GPU silicon.
...
We may never see Nvidia called out on their bullshit, however, does it matter anymore? We already know there's something deeply flawed somewhere in their silicon. Calling them out on it won't break their company, it will only make them get nastier on their next set of programs that limit what anyone can say about their products.
Given that the problem is most likely in the GPU silicon itself, do we actually know that it is Nvidia? TSMC has had issues with their smaller nm processes, so how do we know that this problem isn't ultimately with TSMC?
I don't buy cards from TSMC. If there is a problem in the silicon, it is up to NVIDIA to fix it/catch it and not sell me something they know is broken.
The issues at TSMC have been isolated. Until it is explicitly stated that TSMC has screwed up a bunch of Nvidia's silicon, it's going to be Nvidia's issue. What I think will happen if someone calls Nvidia out on this is that they will simply blame TSMC. They're holding that card in their back pocket.
Unless they did not know it was broken. Sometimes errors only pop up when production is ramped up. By that time all the chips are already in motion... literally. That isn't to say that Nvidia doesn't share the responsibility, but it is hard to blame Nvidia for not fixing a problem they are not directly involved with, and what I quoted was a claim that it was specifically Nvidia's design that was at fault. I was merely suggesting it may not be the design; it may be the manufacturing process.
I think the big argument here is about protecting the consumer, which Nvidia totally doesn't give a rat's ass about. The 20 series cards saw a price hike, a massive one at that. There is zero transparency in that company. They have a history of trying to screw their consumer base. I get what you're saying, that once the wheel is in motion, it's hard to stop. Nvidia acknowledging that there was an issue would have been the ethical thing to do. However, they didn't do that. They flat out lied about the frequency of issues by calling them test escapes and claiming it was some absurd .001 percent of their cards that had the issues, when in fact it was more like 20-25% of all of their 20 series products. We have seen 2070s, 2080s, 2080Tis and RTX Titans all fail hard. I don't have any figures about the 2060s... However, it's likely an issue there as well. The 1660 and 1660Ti are derived from the same silicon and may very well suffer the same issues...
The right call would have been to announce the problem, support their user base, correct the issue and then inform the user base of the batches of cards moving forward that don't have the issue anymore. This would have hit their stock price, people would have stopped buying their cards, and their stock, which had already tanked, would have been reduced to a dumpster fire. The discovery TSMC made hadn't yet become public knowledge, and that is why Nvidia lied and remained silent.
If they came out now, it would look even worse, like corporate negligence or flat out lying to their consumer base. Furthermore, coming out now would likely ignite a class action lawsuit against them.
The fact that Nvidia hasn't even broached the subject of TSMC's contaminated batches of silicon means one of two things. As I said above, it's a card to save in their back pocket in case they do get called out... And/or it's because TSMC's bad batches have nothing to do with the problem.
There is no truth in unsubstantiated rumors.
There are a lot of murmurings about increased failure of the card, but the return rates are only at 3.5%, with half being normal returns. So you have a failure rate, if we are being generous, of 2%. Do you shut everything down and stop selling the product because of a 2% failure rate? Do you panic before you find the cause of some of these recurring and similar failures? Tell me, what do you do?
Hard to believe when people (and reviewers) are complaining about receiving defective replacement cards too.
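For anyone who wants to check the 2% figure, here is a rough sketch of the arithmetic in Python. The function name is made up for illustration, and the inputs are just the numbers quoted in the post (3.5% returns, half of them ordinary returns), not verified data.

```python
def implied_failure_rate(return_rate: float, normal_return_share: float) -> float:
    """Fraction of sold cards that came back because of a defect.

    return_rate: all returns as a fraction of cards sold (assumed 0.035 here).
    normal_return_share: fraction of those returns that are ordinary,
        non-defect returns (assumed 0.5, per the post).
    """
    return return_rate * (1.0 - normal_return_share)


rate = implied_failure_rate(0.035, 0.5)
print(f"{rate:.2%}")  # 1.75%, which the post rounds up to 2%
```

Plugging in a 20% return rate with the same split would give 10%, so the whole disagreement comes down to which return figure you trust.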
LOL WTF?? 3.5% failure rate, where are you plucking that figure from? Nvidia?
It was Kyle who said that AIBs have reported to him that they are experiencing an RMA rate higher than 20%. It would only take a quick look around the various tech forums to see that the figure is much more likely to be 20% than 2%.
And if you trust and have faith in Kyle's ability and willingness to get to the bottom of this issue, then you have to believe his reporting of the returns rate is accurate.
My failure rate, alone, with the 2080Ti is 66.67% Lol
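To put the 2% vs 20% argument in perspective, here is a quick sketch of how likely it is for one buyer to see two dead cards out of three under each rate. The two-out-of-three reading of the 66.67% figure is an assumption (the post gives no counts), the failures are assumed independent, and `prob_at_least` is a hypothetical helper name.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(at least k failures among n independent cards, each failing with probability p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Compare the two failure rates argued about in the thread.
for p in (0.02, 0.20):
    print(f"failure rate {p:.0%}: P(>=2 of 3 fail) = {prob_at_least(2, 3, p):.2%}")
```

Under these assumptions it comes out to roughly 0.1% at a 2% failure rate versus about 10% at a 20% failure rate. One person's experience proves nothing either way, but it is a lot less surprising if the higher figure is right.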
After my unpleasant experience of a dying RTX 2080ti FTW3 a few days ago, I will provide here the only logical explanation for this RTX mess.
Facts:
- all models from all brands are affected: this problem can impact any RTX card
- different PCBs with oversized power supply and cooling are also affected: so it's not a problem related to temperature or power failure
- cards can die with Micron memory (used from the beginning, so widespread) but also with Samsung memory (used later): so the problem is not related to the memory, otherwise it would mean both Samsung and Micron are unable to manufacture GDDR6 (and memory still would not explain the other types of death that show no artifacts)
https://forums.geforce.com/default/...-evga-2080ti-ftw3-icx2-hydro-copper-error-43/
- we can see different types of death with different symptoms: it's hard to believe that this RTX generation is cursed to the point of suffering several unrelated problems
- the core itself remains the only common point between all these dying cards: knowing that some cards are still working fine, it's not a faulty design of the Turing architecture from Nvidia
- the survival time is variable: this implies abnormal and progressive deterioration of the core over time (not detectable at the factory)
- we learn at the same time that the TSMC 12nm fab that builds RTX has suffered from a serious contamination: but little information was provided, so as not to scare away customers and investors
https://www.extremetech.com/computi...stroys-tens-of-thousands-of-nvidia-gpu-wafers
- a contamination of the core can show up in many ways: when the memory controller (inside the core) is affected, we can see artifacts (corrupted cache); if the card has a light contamination level, it can survive longer but with strange behavior (power drops, freezes, crashes, temperature spikes...); and if the core is seriously damaged, Windows shows a blue screen with a core dump and the card is not detected by Device Manager!
God damn you use a lot of fucking words to say nothing of value.
Anywho, yes there can be truth in an unsubstantiated rumor. In this example it’s only unsubstantiated because the hypothetical evidence might not be possible to release. In other examples rumors can be true from undisclosed sources.
3.5% is the highest reported figure I have seen from either Nvidia or resellers. It may be that the numbers are higher but are not being reported. What is being reported has nothing to do with my faith in Kyle's ability or his conclusion. Also, my original premise that it may have been something that slipped through seems to be a thought shared by Nvidia, as early as November of last year. And it seems mostly to deal with Ti cards. Even if we take the figure of 20%, the same logic applies, though.
I see no one has chosen to answer what should be done?
No, by definition, there cannot.
The only thing that can be done is for the truth to come out.
The failures afaik are on 2070s, 2080s, 2080Tis and RTX Titans. I don't have any figures on the 2060s but suspect they are suffering in similar fashion. The 1660 and 1660Ti are made from the same silicon... I don't have figures on those recent boards tho.
I think this is it as well. The GPU silicon. I was down to either the GPU silicon or the PCB, but since custom AIB cards are also seeing the issue, and I believe those use their own PCB designs, all that is left is the GPU itself.
As far as the affected RTX cards, is it still just the 2080Ti? Or are the 2080, 2070 etc. also lumped into this? For them to be included, they should be at the same 20% failure rate. I think up to 5% is fairly typical; beyond that I would call it a high rate of failure.
I thought that the silicon of the 2080 vs the 2080Ti were different? https://en.wikipedia.org/wiki/GeForce_20_series They have differing transistor counts by the billions, as well as different sizes.
So it could be a TSMC issue with the production, or a design flaw with the GPU (seems less likely). If the issue affects all three RTX dies (TU102, TU104, TU106), then design flaw becomes much more probable. If it's a manufacturing issue, then it's just one more nail in Moore's Law's coffin, one more issue to be resolved before things move forward...
As far as nVidia not saying anything more than what they already have, they have to protect relationships with TSMC, et al. Plus, they don't want to scare away any sales. If 80% of the cards are good, and end up having a normal lifespan >3 years, at least the customer/enthusiast isn't screwed over... I haven't heard that nVidia or any AIBs are not supporting their customers having these issues. Of course it sucks to have to deal with it, sucks to get multiple failed cards in a row, sucks to pay return shipping, and it sucks they were so expensive.
I'm just going to wait for the 7nm GPUs before I upgrade from my 1080Ti.
If nVidia came out and said "we are going to double the warranty on the 2080Ti cards", that would be well received by the community. Of course, if EVGA still has lifetime warranty, this is already in place at least for some... does anyone still do lifetime warranties? Maybe I am thinking of old BFG...
Still hoping that there is some kind of conclusion to the testing Kyle is doing, but if it's the silicon, I'm not sure that any amount of testing will reveal that, short of swapping the GPU out with one from a card that has been running for months with no issues. Expensive, and not very easy to do. I wonder if Louis Rossman could swap these giant ass GPUs...
Appreciate all the work you've put into this Kyle.