Dell on the Capacitor Issue

Responding to the recent articles about Dell’s faulty capacitor issues back in 2003-2005, a company spokesman pointed out that it was an industry-wide problem and that Dell never intentionally shipped faulty motherboards.

As noted in a New York Times article about the lawsuit, faulty capacitors were manufactured by Nichicon, a respected, long-term supplier to many industries. These capacitors were used by Dell suppliers at certain times from 2003 to 2005. The faulty Nichicon capacitors affected many manufacturers, including Hewlett-Packard, Apple and others, as discussed in the initial story and several blog posts afterward. Again, this was an industry-wide problem.
 
I wish I had kept track of how many motherboards I have sent back to Dell because of bad caps. We're running out of decommissioned GX280s to use for spare parts, and it feels like we're still getting about 5-10 GX620s a week that won't boot due to caps or bad PSUs.

Thank you, Dell, for my continued job security!
 
Pretty sure I mentioned this very problem in another thread just before this one. I fix a lot of Dells with bad motherboards, and it becomes a bigger problem when the motherboard isn't ATX and the OS is tied to the board as well.
 
"faulty cap's" is just marketspeak for:

"During the design phase we set silly tight volume and price margins. The design engineers cut as close to the MAXIMUM, run at this level and lifetime shortened, level of voltage, ripple current across operating temperature. This means there is a sharp rise in failures. But we can't say its a design issue, we put it downto a sourcing issue"


This is the problem with non-safety critical hardware designs, they push things sooo close they they will fail! I have to design things at an ambient of 90C (so components are de-rated from 75C) and each component cannot be stressed for anything greater than 90%.

I have put caps down that are too close to the absolute rating and damn! they fail. NO SHIT! you push something outside its operating range THEY will fail!
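
To put rough numbers on why that matters, here's a minimal sketch of the usual electrolytic-cap rule of thumb: lifetime roughly doubles for every 10C of headroom below the rated temperature (ripple current heats the core and eats that headroom). The rated figures below are invented for illustration, not any real part's spec:

# Rough electrolytic-cap life estimate via the common
# "life doubles per 10 C below rated temperature" rule of thumb.
# Rated values here are made-up examples, not real part specs.

def estimated_life_hours(rated_life_h, rated_temp_c, core_temp_c):
    """Arrhenius-style approximation: L = L0 * 2**((Tr - Tc) / 10)."""
    return rated_life_h * 2 ** ((rated_temp_c - core_temp_c) / 10)

# A hypothetical 2000-hour, 105 C part run with headroom vs. at its limit:
for core_temp in (65, 85, 105):
    hours = estimated_life_hours(2000, 105, core_temp)
    print(f"{core_temp} C core -> ~{hours:,.0f} h (~{hours / 8766:.1f} years)")

Run it with headroom and you get years; run it at the limit and you get months. That's these boards in a nutshell.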
 
I wish I had kept track of how many motherboards I have sent back to Dell because of bad caps. We're running out of decommissioned GX280s to use for spare parts, and it feels like we're still getting about 5-10 GX620s a week that won't boot due to caps or bad PSUs.

Thank you, Dell, for my continued job security!

So replace the caps themselves... not hard to do. Takes about 30 minutes to do a whole motherboard by hand.

I'm not discounting the fact that it is a sucky issue... but if you are still running GX260s, 270s, or 280s, or any other P4-based or older machines, then the company must not be too worried about efficiency or the amount of work being done by employees.
 
"faulty cap's" is just marketspeak for:

"During the design phase we set silly tight volume and price margins. The design engineers cut as close to the MAXIMUM, run at this level and lifetime shortened, level of voltage, ripple current across operating temperature. This means there is a sharp rise in failures. But we can't say its a design issue, we put it downto a sourcing issue"


This is the problem with non-safety critical hardware designs, they push things sooo close they they will fail! I have to design things at an ambient of 90C (so components are de-rated from 75C) and each component cannot be stressed for anything greater than 90%.

I have put caps down that are too close to the absolute rating and damn! they fail. NO SHIT! you push something outside its operating range THEY will fail!

Just about every OEM and aftermarket manufacturer had issues with caps around that time period. Maybe it was the tightwaddedness of the companies, and maybe it was mostly the fault of the crappy cap manufacturers overrating their product. Kind of like how crappy PSU manufacturers way overrate the capacity of their super crappy PSUs.

I've even had P4 boards made by Intel that have had the cap issue... in fact, I have one sitting in my computer room right now. I have an old P4 ASUS board that has the cap issue. A lot of Slot-A boards had the issue.

I do get that it was a crappy thing to happen. In fact, a few years ago, I was working as an onsite Dell tech. The 4-5 of us that worked for the same branch of the warranty company in Tucson, AZ replaced thousands of GX260-280 motherboards under warranty.
 
When I worked at Intel, we had tons of SX280s with blown caps. The caps would be leaking; it was a pretty obvious problem. Intel buys extended warranties on their systems, so it didn't matter that much.
 
I had 2 boards with caps that blew. One was Biostar, the other was Soltek. Switched to ABIT after that, and that board is still running as we speak, overclocked too.
 
Dell has been really good about replacing those motherboards. We have some of these ancient things running in front of proprietary equipment, and they still fry and Dell still sends us new boards. I'm not a fan of Dell stuff at all, but I can't say anything bad about how they've supported us with these things, even when we wouldn't think they would.
 
That's why I went with a Gigabyte Ultra Durable. Solid caps don't leak. No... no, they burn. :D
 
If you look up the Wall Street Journal article that ran on this issue, you will see where they state that Dell knowingly, KNOWINGLY shipped 11.8 million bad computers. They commissioned an outside contractor to look into the issue, and the contractor even told them that the boards were faulty, and they KEPT SHIPPING THEM.

I'm typing this on a Dell. I have 5 other Dells in the house. I won't buy another Dell and I won't make any further professional recommendations for them.

Hello Acer.
 
As another poster already mentioned, a lot of manufacturers were having capacitor issues during this time frame. I don't remember model numbers, but Gateway was another big company with known issues. And this wasn't restricted to Intel boards. Sounds to me like Dell needs a better document retention policy. Don't keep documents lying around that can burn you.
 
This is a really famous issue in the annals of computers. There's even a Wiki:

http://en.wikipedia.org/wiki/Capacitor_plague

I didn't read the whole thing, but I don't think it goes into what really kicked this whole thing off. It goes into what happened, but not why. Probably the best point of view for the Wiki author to take.

As I remember reading years ago, it started with the theft of a capacitor electrolyte formula; when the guy cloned it, he missed some sort of "secret sauce" that was in the original. He then proceeded to sell billions of the defective little bastards before the issue was found, as it took time for the caps to destabilize and present the problem.
 
We had IBM 300XLs dropping like flies after about 2 years of age.

And ABIT, I think, was the first MB manufacturer to really have to deal with it on a large scale. IIRC they took care of me pretty darned well, though.
 
They may not have knowingly shipped bad MBs, but I don't think that's the issue at heart here. I think the real issue is what happened after they found out: after Intel pulled the boards, Dell did not. You could say "I have bad caps, or bulging caps" and they'd fix it... if it was in warranty. You could not, however, say "I have these GX270s and 280s, the boards are known to fail, so let's set up a swap-out." No, they would only handle it case by case, in warranty only. That would be the issue.
 
So replace the caps themselves... not hard to do. Takes about 30 minutes to do a whole motherboard by hand.

I'm not discounting the fact that it is a sucky issue... but if you are still running GX260s, 270s, or 280s, or any other P4-based or older machines, then the company must not be too worried about efficiency or the amount of work being done by employees.

Depends on the work of the end users. A P4 is fine for most call center desktops.
 
That's a grey area that mfrs absolutely hate. They are not fond of replacing things proactively unless it's a safety issue. That's why, I think, they are much more willing to issue a battery or A/C adapter recall, as the liability could be immense. In some cases that's warranted. In this one, given it was a known issue with known components that were going to fail eventually, it should have been taken care of, and Dell should have been able to get reimbursed themselves, I would think.



They may not have knowingly shipped bad MBs, but I don't think that's the issue at heart here. I think the real issue is what happened after they found out: after Intel pulled the boards, Dell did not. You could say "I have bad caps, or bulging caps" and they'd fix it... if it was in warranty. You could not, however, say "I have these GX270s and 280s, the boards are known to fail, so let's set up a swap-out." No, they would only handle it case by case, in warranty only. That would be the issue.
 
If you look up the Wall Street Journal article that ran on this issue, you will see where they state that Dell knowingly, KNOWINGLY shipped 11.8 million bad computers. They commissioned an outside contractor to look into the issue, and the contractor even told them that the boards were faulty, and they KEPT SHIPPING THEM.

I'm typing this on a Dell. I have 5 other Dells in the house. I won't buy another Dell and I won't make any further professional recommendations for them.

Hello Acer.

Wrong on all counts.
Of course they kept shipping them. The problems didn't manifest right away, and it took even more time for enough problems to be reported for Dell to notice a trend. Then Dell began the internal investigations to find the source... etc. Did you expect all operations to halt just because a few boards came back faulty?

I notice you fail to mention that Dell could not identify which motherboards contained Nichicon capacitors, nor whether the Nichicon capacitors would fail.

The Nichicon capacitors were used between 2003 and 2005. Dell worked with customers to fix OptiPlex motherboards as they failed, and Dell extended the warranties on all OptiPlex motherboards to January 2008 in order to address the Nichicon capacitor problem. Failure rates were highly variable depending on the customer’s environment and the number of Nichicon capacitors in the motherboards.

Dell didn't knowingly ship computers with bad caps. They didn't have a reliable way to detect which parts were going to fail. The situation wasn't quite as dire as the media claimed; that's how media works.
If the headline had stated "Dell is working with accounts to replace faulty equipment," it wouldn't have gotten a second glance.
 
still running GX260s, 270s, or 280s, or any other P4-based or older machines, then the company must not be too worried about efficiency or the amount of work being done by employees.


I just left a company with tons and tons of old GX260s a few weeks ago. It was funny how the machines would get worse and worse and then suddenly stop working.

You might think that people would be annoyed, but they were not. They were happy the machines were failing, because then there was a remote chance they might get a more modern computer.

After one failed, it was a bit of a lottery to see if they would yank out the HD and put it in a spare GX260 for you and continue the misery for another 6 months until it failed again, or if they were out of spares and would get you a new machine.

It was a company with many great people, and absolutely miserable and cheap management. I'm glad I left.
 
Dell didn't knowingly ship computers with bad caps. They didn't have a reliable way to detect which parts were going to fail. The situation wasn't quite as dire as the media claimed; that's how media works.
If the headline had stated "Dell is working with accounts to replace faulty equipment," it wouldn't have gotten a second glance.

Which is really surprising, as Dell is the textbook/case-study example at many business schools of how to properly use high-tech MRP and distribution systems to control inventory lots and know exactly what goes where efficiently...
 
Depends on the work of the end users. A P4 is fine for most call center desktops.

You sure about that? Almost every time I am forced to call into a call center, the person on the other end usually says that the systems are running extremely slow today.
 
You sure about that? Almost every time I am forced to call into a call center, the person on the other end usually says that the systems are running extremely slow today.

It's probably a network issue rather than a computer issue.

Heck, for the kind of stuff call centers are doing, a 286 with a terminal emulator ought to be sufficient...
 
Zarathustra[H] said:
Which is really surprising, as Dell is the textbook/case-study example at many business schools of how to properly use high-tech MRP and distribution systems to control inventory lots and know exactly what goes where efficiently...

You have a point; I don't know anything about business school. However, I am very familiar with Dell's inventory and dispatching system. I don't want to go into detail about it, since I like my job.
Small batches of parts reduce overhead, but that doesn't quite apply to many OptiPlex systems; those systems are built in bulk and shipped out.

The problem with tracking the failed components comes down to the fact that the failures were sporadic and not every failure was a motherboard cap. Even worse, the failures were occurring on different platforms. Some were power supply failures, which Dell doesn't analyze on a regular basis. Eventually Dell got the memo and sent out notes explaining what to expect and what to do about it.

Dell could have handled the situation differently, but I don't really think they were wrong to continue shipping parts out. It sure sucked to RMA a board and receive another bad one in its place. Most businesses don't appreciate having their systems down for 2 or 3 weeks at a time.
 
Based on my experience, hundreds of Dimension 45x0 power supplies will also die given enough time; might as well roast them for that too. This problem got so bad Dell shipped us replacements on pallets. As long as they cover the replacement for problem XYZ, I do not see what the problem is. It's not like Dell was shipping out OptiPlex systems and then counting the days until they knew the system would die.
 
Hello Acer.

Seriously? I use Acer stuff at home, but I'd never put them in a business environment, except for their monitors.

We use HP and can't really complain. They fry as often as anything else, but the support is good, at least in my experience.
 
You sure about that? Almost every time I am forced to call into a call center, the person on the other end usually says that the systems are running extremely slow today.

That may or may not have anything to do with the agent's desktop. Most likely it does not.
A lot of call centers install just enough PC power to bring up a 3270 emulator to connect to a mainframe. It is then just the mainframe access that is slow, not the PC itself.
A lot of call centers also just use a web browser to access their CMS, and it doesn't take a lot of cycles to render the HTML forms and CSS for those.

I was the admin for a call center with about 500 PCs in the early aughts. Swapping out the standard K6-2 400s for PIII 1GHz boxes didn't do anything to bolster productivity in anything I tested.
I tried to get those 500 K6-2s folding, but they took forever to complete WUs.
 
Just an FYI: I got on Dell's chat support and asked if they were going to extend the warranties or replace motherboards outside the warranty period because of the capacitor issue (after the NYT article). After a little back-and-forth banter, I got her to admit there *was* a program to replace motherboards with bad caps, but it ended in January 2008. Since we had 4-year warranties on our PCs, they were already being replaced in that time frame. I didn't get her to admit they would replace any now, but I suspect if we pushed our sales goon we could get some with obvious bad caps replaced.
 
I think the bigger issue that came out at trial was that even after they knew there was a problem, they told their support personnel to say that there were no known issues and to treat it like a freak occurrence.
 
Here is more information about the source of the problem.

http://www.badcaps.net/pages.php?vid=4

What Causes This Disaster?
How did this happen?
The reason this problem exists is a large-scale industrial espionage foul-up. Some companies decided to steal an electrolyte formula from a competitor. Little did they know, the stolen formula was incomplete and flawed. They didn't discover this until it was too late and they had manufactured and distributed literally MILLIONS of these flawed capacitors. It was way too late for any kind of recall, and even today these crappy components are being used in new boards. As I mentioned before, I believe this problem runs much deeper than simply an industrial espionage screw-up, as that incident was exposed years ago and the problem still exists today. Nowadays, it just boils down to corporate bean counters cutting corners to save money by using shoddy components.
 
Depends on the work of the end users. A P4 is fine for most call center desktops.

NO WAY! I am someone who has worked in call centers for 4 different companies, with every single one thinking, "Oh yeah, a P4 is enough." You know the problem? Web applications that use Java, extremely heavy scripting, and XML software packages that make that P4 feel more like a 386 running Windows 3.1 (crashes just as much too).

Also, in every single one of these companies you had to have like 20 different things all open at once in order to do your job.
 
NO WAY! I am someone who has worked in call centers for 4 different companies, with every single one thinking, "Oh yeah, a P4 is enough." You know the problem? Web applications that use Java, extremely heavy scripting, and XML software packages that make that P4 feel more like a 386 running Windows 3.1 (crashes just as much too).

Also, in every single one of these companies you had to have like 20 different things all open at once in order to do your job.

So true. I've done work in 3 call centers; each one had its own unique mix of web applications and call 'tools' that were inefficient and slow.

One was running two ISDN lines for a connection to support about 200 reps. The client wanted us to try out their new logging system hosted on another coast 'to help ensure quality.' It was quite possibly the worst UI I have ever seen and required every rep to manually populate more than 20 fields (even their own name!). What made it really suck was that it was slow and would hang on every field before committing it and letting you move on. Our response was basically 'this sucks, we cannot use it.' Two weeks later they shut off the old system and forced us to the new one. Once all the reps switched, it bogged down so badly you could not fill out everything in less than the expected call time. The client would not let us upgrade the ISDN connection either. Quality went right down the shitter, and we had 100% employee turnover every 3 months until the client moved all the jobs to India.
 
I read this thread with interest, both due to the hardware and also first-hand user experience.

1. I experienced the relevant issues in a large-scale environment years ago.
2. My apologies, obviously I cannot provide references.
3. I am not commenting on whether it was handled properly; I am just providing unsubstantiated info.

4. The problems happened to many computer vendors. Do not think it was restricted to only a certain company.
5. In one of my previous employers' case, we bought one particular brand of desktops (the company standard) in bulk. We were surprised that many machines had persistent random hangs. We suspected network/software/malware problems and did a lot of repeated OS rebuilds over many months.
6. It was by chance that we read on the Internet about the capacitor problem. We also saw some capacitors showing the signs described. We managed to get the principal to send an official onsite to inspect.
7. Even under such circumstances, the principal told us frankly they could only replace those which had actually failed completely. Even then, they did not have replacement motherboards, and we needed to wait for some time.
8. We switched to another brand shortly after that. Not because other brands won't fail, but due to better failure-handling response.
 
NO WAY! I am someone who has worked in call centers for 4 different companies, with every single one thinking, "Oh yeah, a P4 is enough." You know the problem? Web applications that use Java, extremely heavy scripting, and XML software packages that make that P4 feel more like a 386 running Windows 3.1 (crashes just as much too).

Also, in every single one of these companies you had to have like 20 different things all open at once in order to do your job.

I've done contracting work at dozens of call centers, and it is the same story everywhere. A web-based CMS, the PBX client, and one or two more support pages (upsells, parts lookup, etc.). Maybe a 3270 emulator thrown in. Nothing that ever taxed the boxes.
One company had a crazy AJAX inline autocomplete for 411-style directory services; it smoked the crap out of some backend hardware, but it didn't even register as a blip on the CPUs of the agent desktops.

As the full-time sysadmin I polled perfmon for analytical purposes; all of our boxes were underutilized to an extreme. We even experimented with underclocking to reduce power draw on the Symmetras. If the workstations were slow, people would complain every time. If a program had a memory leak and the box started paging, the agent would bitch about it, and of course a reboot fixed the issue. It was always the supervisors who needed the bigger boxes; Crystal Reports wanted to eat up all that extra RAM.
If faster desktops would have made us more money, we would have bought them. As it was, we tried it and it didn't help anything. If it would have boosted sales or productivity, I could have made a business case for the upgrades... but it didn't.
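
If you want to sanity-check a fleet the same way, here's a minimal sketch of that kind of utilization logging in Python using psutil (an assumption on my part for portability; we actually pulled the same numbers from native perfmon counters):

# Minimal CPU-utilization logger. psutil is an assumed dependency
# (pip install psutil); perfmon counters give you the same data on Windows.
import psutil

samples = []
for _ in range(60):
    # cpu_percent(interval=60) blocks for 60 s and returns the average
    # utilization over that window, so the loop covers one hour.
    samples.append(psutil.cpu_percent(interval=60))

print(f"hourly average CPU: {sum(samples) / len(samples):.1f}%")
print(f"worst one-minute sample: {max(samples):.1f}%")

If the hourly average sits in the single digits, faster desktops won't buy you anything.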
 
I've done contracting work at dozens of call centers, and it is the same story everywhere. A web-based CMS, the PBX client, and one or two more support pages (upsells, parts lookup, etc.). Maybe a 3270 emulator thrown in. Nothing that ever taxed the boxes.
One company had a crazy AJAX inline autocomplete for 411-style directory services; it smoked the crap out of some backend hardware, but it didn't even register as a blip on the CPUs of the agent desktops.

As the full-time sysadmin I polled perfmon for analytical purposes; all of our boxes were underutilized to an extreme. We even experimented with underclocking to reduce power draw on the Symmetras. If the workstations were slow, people would complain every time. If a program had a memory leak and the box started paging, the agent would bitch about it, and of course a reboot fixed the issue. It was always the supervisors who needed the bigger boxes; Crystal Reports wanted to eat up all that extra RAM.
If faster desktops would have made us more money, we would have bought them. As it was, we tried it and it didn't help anything. If it would have boosted sales or productivity, I could have made a business case for the upgrades... but it didn't.

There were several Java apps on our PCs that had tons of bugs, which made us frequently restart the programs. One of the companies even made us access their billing system through a Java client. Starting a bloated Java program on an old P4 would take at least 2 minutes. When you work in a call center, every second counts. Waiting 2 minutes for a program to load reduces your credibility with the customer and increases your call time, and possibly shrinks your paycheck along with it. The harder I pushed those Java apps, the faster they went down. Corporate types love it, but I hate Java with a burning passion. Same goes for every other web app that times out when you don't use it for a mere 15 minutes.

I used to keep my task manager open all day long; that CPU never went under 25% while taking back-to-back calls and would routinely be at 80-100% while one of the many tools was being used.

Don't get me started on the Symantec antivirus scanning in the background. Every time Symantec started working (which was once an hour), I would have to wait about 2-3 minutes for it to finish. The computer would be barely responsive and totally useless. The software was doing some type of quick scan or inventory, I don't know exactly what. It sure wasn't doing a full antivirus scan, because the process would drop to 0% once the computer was responsive again and the disk thrashing stopped.

Oh yes, AJAX is everywhere, and boy is it a PITA for increasing call times. Especially when you are tied to IE6! It's like the developers never realized that IE6 wasn't made for that. It's so nice when you submit a form and then all the IE windows on the entire computer become unresponsive, waiting for that one form in the one window to complete its submission. Oh wait, the developers of this software were running dual-core machines with 2-4 GB of RAM and a much faster FSB. The developers didn't have to sit in front of a machine 12 hours a day taking back-to-back calls and experience the uselessness first-hand, all while being grilled by management to get off the call, grilled by the customer for having to wait, and having your stats take a hit for something that isn't your fault. So then you rush the next call, probably not helping the customer correctly, just to get back into range, because I'll be damned if I'm going to take a hit in my wallet for someone else's poor choices. As long as you do it with a smile, management doesn't even care, lol.
 
There were several Java apps on our PCs that had tons of bugs, which made us frequently restart the programs. One of the companies even made us access their billing system through a Java client. Starting a bloated Java program on an old P4 would take at least 2 minutes. When you work in a call center, every second counts. Waiting 2 minutes for a program to load reduces your credibility with the customer and increases your call time, and possibly shrinks your paycheck along with it. The harder I pushed those Java apps, the faster they went down. Corporate types love it, but I hate Java with a burning passion. Same goes for every other web app that times out when you don't use it for a mere 15 minutes.

I used to keep my task manager open all day long; that CPU never went under 25% while taking back-to-back calls and would routinely be at 80-100% while one of the many tools was being used.

Don't get me started on the Symantec antivirus scanning in the background. Every time Symantec started working (which was once an hour), I would have to wait about 2-3 minutes for it to finish. The computer would be barely responsive and totally useless. The software was doing some type of quick scan or inventory, I don't know exactly what. It sure wasn't doing a full antivirus scan, because the process would drop to 0% once the computer was responsive again and the disk thrashing stopped.

Oh yes, AJAX is everywhere, and boy is it a PITA for increasing call times. Especially when you are tied to IE6! It's like the developers never realized that IE6 wasn't made for that. It's so nice when you submit a form and then all the IE windows on the entire computer become unresponsive, waiting for that one form in the one window to complete its submission. Oh wait, the developers of this software were running dual-core machines with 2-4 GB of RAM and a much faster FSB. The developers didn't have to sit in front of a machine 12 hours a day taking back-to-back calls and experience the uselessness first-hand, all while being grilled by management to get off the call, grilled by the customer for having to wait, and having your stats take a hit for something that isn't your fault. So then you rush the next call, probably not helping the customer correctly, just to get back into range, because I'll be damned if I'm going to take a hit in my wallet for someone else's poor choices. As long as you do it with a smile, management doesn't even care, lol.

I feel your pain, man. I work in a call center as well, and the only thing I dislike more than Java is Siebel. Honestly, everything should be XML apps pulling data via a single VPN connection. This would fix MANY things.
 
The capacitor issue killed EPoX off the market. When mine went, it took my 6800 Ultra with it. Damn near started a fire too.
 
We had IBM 300XLs dropping like flies after about 2 years of age.

And ABIT, I think, was the first MB manufacturer to really have to deal with it on a large scale. IIRC they took care of me pretty darned well, though.

Was it not this issue that pretty much killed ABIT in its original form? Like you, I remember they were one of the few companies to look after their customers over this, and also to admit the problem. Now look at them. :(

It's a fucking shame that the bigger a bastard you are in life (in both a personal and corporate sense), the more you seem to be rewarded.

(As an aside, I still have an ABIT KG7-Lite running in my old backup rig.)
 