TLER / CCTL / ERC thread

Wolvenmoon

Okay guys, I have been smashing my head against a freaking wall with Samsung, Seagate, Western Digital, and Hitachi customer support.

I grabbed 2 WD drives, popped in my handy dandy modded UBCD 4.11, went to disable TLER, got an error, and that started my incredibly frustrating journey. I've lumped WD's customer support and been pestering Hitachi and Seagate. Samsung's been very quiet.

I have Hitachi saying "You can use CCTL on our consumer drives if you want." Seagate's sounded either slightly confused or threatening going "No, you can use any of our drives in RAID-5, they'll just break faster." My forum research has led me in so many freaking circles I am finally tired of it.

Since I'm a staff member of www.reclaimyourgame.com and I've run headfirst into a metal bar of defective-by-design hardware, I'll be writing an article with everything I find out and I will be linking to this thread. My face when I realized that I was honor bound to raise a stink over this was priceless. Somewhere between 'holy crap' and 'I was supposed to be on VACATION!'.

So, seeing this, http://hardforum.com/showthread.php?t=1482622&highlight=TLER&page=19 , I think I'm in the right place.

Let's put a silver bullet in this beast's head. I'm going to just throw it all to you guys and see what you all have. I'll second post with my research so far.

-What brands allow you to persistently adjust the error handling times, regardless of the name, on their newest consumer drives as of November 16th 2010?

-What utilities do you need?

-What brands allow non-persistent changes to their consumer drives' error handling times, in what operating systems, and what utilities do you need to do this?

-What brands explicitly do not allow modification of these settings? ( We all know, this is for the purpose of consolidation ).

-What brands, if any, have other traps to RAID users implemented in their disks, and what do you need to take care of it?

And lastly:

-On what controllers do CCTL/TLER/ERC/etc. not matter at all, and how much do they cost retail (unless integrated)?
 
My research in summary:

http://hdat2.getphpbb.com/hdat2-comm...c-t186-15.html
http://hdat2.getphpbb.com/hdat2-comm...-7-1-t201.html
http://forum.hddguru.com/samsung-cct...er-t11630.html
http://forum.qnap.com/viewtopic.php?f=24&t=25730

^--Basically those all will lead you on a wild goose chase to find non-persistent SMART commands to issue to your drive to limit error handling time.

http://forums.pcper.com/showthread.php?p=4474917#post4474917 <--Me freaking out, not much info presented.

http://wdc.custhelp.com/cgi-bin/wdc...dp.php?p_faqid=1397&p_sid=zkI2Y8fk&p_lva=1478

http://wdc.custhelp.com/cgi-bin/wdc.cfg/php/enduser/std_adp.php?p_faqid=1478

WD's official FAQs, in particular I like this part:
"...While TLER is designed for RAID environments, a drive with TLER enabled will work with no performance decrease when used in non-RAID environments."

Hmm, I wonder why? :eek:

A chart I pulled out of WD's support: http://www.wdc.com/en/library/sata/2579-701239.pdf?wdc_lang=en It shows supposed support for consumer drives on old RAID controllers and is fairly irrelevant today.


Hitachi's reply to my e-mail asking "Does Hitachi allow CCTL to be enabled on their consumer level drives?"
Hitachi GST said:
Hello (Wolvenmoon),
Our consumer level drives have no issues running in a small RAID environment such as the one you describe. The CCTL feature can be enabled using our Feature Tool, freely available on our downloads page. Usually though this feature doesnt enable or prevent a drive from being used in a RAID setting unless this is a requirement of your controller/array. If you have any other questions, please feel free to ask.
Best Regards,
Art

It sounds definitive, but I want a double check because I was taught to ask three times before I drop a Benjamin on something. :D
 
Welcome to the nightmare.

For years WD allowed you to enable TLER on the "EADS" (WD10EADS, WD20EADS) series of green drives, but this has been disabled on newer drives.
Some people get lucky and get a drive you can still enable it on, but if you have to RMA it, chances are you'll get one back that doesn't allow it. The WD RE series still allows you to enable TLER, but those drives cost quite a bit more.

From my understanding, Hitachi, Samsung and Seagate support CCTL, but the setting doesn't survive a power cycle, so the RAID controller will need to issue the command each time the array is initialized. Supposedly some of the high-end Adaptec cards support this.
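
For drives that accept the setting but forget it on power-off, one workaround on a host that can run smartmontools is to reissue the command at every boot. A minimal sketch, assuming smartctl's `-l scterc,READ,WRITE` option (values in tenths of a second) and hypothetical device names and helper functions; it only builds the command lines unless `apply=True` is passed, which needs root:

```python
import subprocess

def scterc_command(device, read_ds=70, write_ds=70):
    """Build the smartctl command that requests an error recovery limit.

    Values are in deciseconds, so 70 means 7.0 seconds.
    """
    if read_ds < 0 or write_ds < 0:
        raise ValueError("timeouts must be non-negative")
    return ["smartctl", "-l", f"scterc,{read_ds},{write_ds}", device]

def reapply_erc(devices, read_ds=70, write_ds=70, apply=False):
    """Return the commands for every device; run them only if apply=True."""
    commands = [scterc_command(d, read_ds, write_ds) for d in devices]
    if apply:
        for cmd in commands:
            # Needs root and smartmontools installed. A drive without the
            # feature will simply have smartctl report that it is unsupported.
            subprocess.run(cmd, check=False)
    return commands
```

Calling `reapply_erc(["/dev/sda", "/dev/sdb"])` from a boot script would give the two command lines to run. Whether a given consumer drive honours them is exactly what this thread is trying to find out.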

Here's a huge thread, some of my own experiences with these drives in here as well:

http://hardforum.com/showthread.php?t=1285254
 
Let's give them a nice wake up call then. Keep bringing the information.

Re-read the end of my previous post about Hitachi. Since I have some Hitachi drives, I intended to load up their disk tools and see if CCTL settings are exposed. They're 160 GB SATA-1 drives though, so they might be too old. I'll go ahead and do that tonight. Edit: This is apparently a lie on their part. I'll go ahead and put on my official voice with them and see what I get...tomorrow though. This is quite stressful.

I scanned through that thread. In summary WD has disabled this feature on not only the green drives but also the blue and black drives. Makes sense on the green and blue drives. The black drives, marketed as enthusiast drives, it doesn't make sense for.

I would like to say that from what I can tell, high end $600+ controllers, the types you'd find in an enterprise environment, don't have trouble with hard drive error recovery and dropping them from arrays because they can have their timers set anyway. This is targeted squarely at the consumer market.
 
I would like to say that from what I can tell, high end $600+ controllers, the types you'd find in an enterprise environment, don't have trouble with hard drive error recovery and dropping them from arrays because they can have their timers set anyway. This is targeted squarely at the consumer market.

What RAID controllers do you know of that allow you to set the timeout in seconds after which it considers a drive to have failed and drops it?
 
I would like to say that from what I can tell, high end $600+ controllers, the types you'd find in an enterprise environment, don't have trouble with hard drive error recovery and dropping them from arrays because they can have their timers set anyway. This is targeted squarely at the consumer market.

I've been having this nightmare too (as detailed on my site Jons Guides). I too would be interested to know what controllers allow you to set the timeout value - I can't find any information about any such controllers. My Areca 1220 doesn't seem to offer any option to configure this.

I should also point out that this isn't an "in theory" problem for me - I have actually had to rebuild my raid array many times due to drives being kicked out, on average once every month-6 weeks for over 2 years. None of the drives have shown any other problems, and all have a healthy smart status.

I have found that by putting my server in the cellar of the house, where the temperature is never above 10-12C, the problems have reduced significantly, so for others I can recommend a cold room - I assume it reduces the drive error rate somehow.

I have also e-mailed Hitachi. Their response seemed deliberately vague to me. I suspect the drives "support" the feature as required by the ATA specifications, but there is no way to enable it permanently (see emails below).

Thank you for contacting Hitachi Technical Support,

CCTL is a feature of the drive and as far as we know the drives are compliant with it.

Regards,

Hitachi Technical Support


> Dear Sir/Madam,
>
> I am looking to purchase a number of 2Tb consumer-grade hard disk drives for
> use with a hardware raid controller. I have heard conflicting reports with
> regard to support for CCTL on Hitachi Deskstar drives (specifically 7K2000
> and 7K3000 drives). Could you confirm the status of CCTL support on these
> drives, specifically, is it possible to enable CCTL with the Hitachi Feature
> Tool or via a similar method.
>
> Thank you for your assistance,
>
> Jonathan Scaife

If anyone knows of any drives that can have ERC/CCTL/TLER enabled, that don't cost 2x the normal consumer drive price I'd be very interested. Ditto if anyone knows of ways to make a controller give drives longer before kicking them from an array.
 
I'm a poor dude else I'd not be looking at consumer drives for a RAID array on a software controller - which SHOULDN'T give you problems according to WD - but does anyway. Pre-TLER my RAID 1 array rebuilt every other week. Post-TLER they've been perfect little bi---drives since.


I'm about to go stand on Hitachi support in my professional voice. Could you please send me your ticket number in PM so I can reference it and politely back them into a corner?
 
I'm a poor dude else I'd not be looking at consumer drives for a RAID array on a software controller - which SHOULDN'T give you problems according to WD - but does anyway. Pre-TLER my RAID 1 array rebuilt every other week. Post-TLER they've been perfect little bi---drives since.


I'm about to go stand on Hitachi support in my professional voice. Could you please send me your ticket number in PM so I can reference it and politely back them into a corner?

I don't have much spare cash myself. And more to the point the drives are for a home-based storage server - for the family's backups, photos, music and recorded tv - it's not a heavy-usage environment requiring enterprise grade drives. I only have the Areca card as it was saved from a server going into a skip by a friend who had no use for it!

I need a very large storage solution that is a bit more reliable than a span of drives. A RAID-5 of consumer drives is perfect for making it reliable enough for purpose - except that IMO this option has basically been deliberately broken by the drive manufacturers. I'm not going to pay 2x the price of normal storage to build an enterprise-grade storage system for a family. Just like I'm not going to buy an all-singing-&-dancing managed cisco switch for a family! Or a cabinet. Or a rack-mount server. :)

/rant
 
Personal experiences with this issue:

WD 2tb drives:
I have 8xWD20EADS with TLER enabled. My circa-2008 Areca 1680ix-12 (current firmware on the card and SAS Expander) would kick a drive or two for no reason every 2 - 6 weeks. The array would rebuild fine. I switched the drives to pass through and built a Linux software RAID. No performance difference. It's been 3 weeks. We'll see.

Seagate 2tb drives:
I have an 8x2tb array created using Linux software RAID. I enabled ERC with smartctl. No problems at all. The drives are connected through an LSI 8 port PCIe SAS card in HBA mode. I also tried my two spares with a 3ware 9650se. The drives kept disappearing from the RAID controller. No luck with smartctl.
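
To check whether a setting like this actually took effect, the current limits can be read back with `smartctl -l scterc <device>`. A small sketch of parsing that report follows. The sample text is an assumption about the output layout (it is not a capture from the drives in this post), and `parse_scterc` is a hypothetical helper name:

```python
import re

# Assumed layout of the "smartctl -l scterc" report; values are deciseconds.
SAMPLE = """SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)
"""

def parse_scterc(text):
    """Return {'read': seconds or None, 'write': seconds or None}.

    None means the limit is reported as disabled or the line is missing.
    """
    result = {"read": None, "write": None}
    for key in result:
        match = re.search(rf"^\s*{key}:\s+(\d+)\b", text,
                          re.IGNORECASE | re.MULTILINE)
        if match:
            result[key] = int(match.group(1)) / 10.0  # deciseconds -> seconds
    return result
```

Running this after a power cycle would show whether a drive kept the value or fell back to its default.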

Hitachi 2tb drives:
At work, I have 40 of these drives in various arrays - Areca and Adaptec. No problems at all and the drives scream. If I were buying drives today, this would be my only choice. I never bothered with trying to enable TLER.

Hitachi 500gb 2.5" EA series:
I have 4 of these on an Adaptec 2405 in RAID 10. This combination also screams. No problems without TLER.

In summary, buy Hitachi and love it.
 
Personal experiences with this issue:

WD 2tb drives:
I have 8xWD20EADS with TLER enabled. My circa-2008 Areca 1680ix-12 (current firmware on the card and SAS Expander) would kick a drive or two for no reason every 2 - 6 weeks.

They were kicked out with TLER enabled?

Hitachi 2tb drives:
At work, I have 40 of these drives in various arrays - Areca and Adaptec. No problems at all and the drives scream. If I were buying drives today, this would be my only choice. I never bothered with trying to enable TLER.

Really useful information, cheers! If I can't get any cheap-drives with ERC then I'm inclined to gamble with Hitachi drives. I've already had drop-outs from WD, Seagate and Samsung, so might as well try Hitachi. Haven't bought Hitachi drives since they were IBM drives back in the bad days of the GXP failures!
 
They were kicked out with TLER enabled?

Really useful information, cheers! If I can't get any cheap-drives with ERC then I'm inclined to gamble with Hitachi drives. I've already had drop-outs from WD, Seagate and Samsung, so might as well try Hitachi. Haven't bought Hitachi drives since they were IBM drives back in the bad days of the GXP failures!

Well it was only a whole decade ago that Hitachi bought IBM out and started over, but "deathstar" does have a nice ring to it so why not keep it going? :) Meanwhile there are those of us that have been quietly and contentedly running desktop class Hitachis in RAID for years with zero problems or dropouts, happily sitting back and watching everyone chase their tails.

I challenge you to get a Hitachi to drop out of an array, and if it does, it likely has little to do with its particular CCTL timeout value and more likely due to bigger problems with that particular drive. I'm not a subscriber to the popular belief that otherwise healthy drives with zero bad or remapped sectors are dropping like flies out of arrays because they lack a shortened error recovery timeout setting. I can appreciate that lots of people noticed a difference in drives no longer dropping out of arrays when enabling TLER on their WD drives, but I'm skeptical that the dropouts were being caused by those drives entering a deep error recovery operation in the first place.

I think enabling TLER on WD drives has the *side effect* of masking some other issue that makes certain desktop class WD drives raid-unfriendly, and take too long to perform some internal operation which doesn't necessarily have to do with error correction or bad sector reallocation. I say that because other brands clearly don't suffer from same, without ever having to make a change to their default deep error recovery timeout.

The bottom line is the whole issue of TLER is case-by-case among different makes and models of drives and affects all of them differently. They have differing default timeout values to the internal operations they perform, which also vary among makes and models. That's why it's unrealistic to expect a WDTLER type tweak tool from every vendor, because in many cases it's an answer in search of a problem.
 
I challenge you to get a Hitachi to drop out of an array, and if it does it has nothing to do with TLER/CCTL/whatever. The truth is that otherwise healthy drives aren't dropping out of arrays because of their particular deep error recovery timeout setting, because otherwise healthy drives wouldn't be *entering* a deep error recovery cycle on any regular basis, and if one does then it's probably got bigger problems anyway and I would want it kicked out. But much like the term "deathstar", the FUD is self-perpetuating.

So what is the timeout for error recovery on Hitachi drives?

I think the major concern for desktop drives without TLER is that if one drive happens to fall out of the array due to it entering deep recovery, the rest of the drives are then put under additional load during rebuild, increasing the chances of failure.

Before I knew anything of this problem, I had 5 x WD20EADS drives set up in a RAID 5 on an Intel SRCSAS18E controller. It didn't last 24 hours before one of the drives fell out of the array. When I pulled it out and did a sector scan on it, it checked out fine.

That's when I learned about the existence of TLER. When I enabled TLER on the drives and re-created the array, it ran without any issue whatsoever for almost a year, until one of the drives failed.
 
Like I said, I realize there were noticeable changes when changing TLER to 7 seconds on WD drives, but I'm skeptical that your drives were falling out of the array within 24 hours due to things like bad sectors actually being encountered. Otherwise healthy drives simply don't enter into a deep error recovery cycle with that kind of frequency. The culprit had to be something else.
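
Both sides of this argument come down to comparing two timers: how long the drive keeps retrying on its own, and how long the controller waits before giving up on it. A toy model of that comparison is sketched here. The 7-second cap is the TLER value mentioned above. The 8-second controller timeout and the 30-second recovery time in the usage note are assumptions for illustration only, not documented values for any drive or controller in this thread:

```python
def drive_outcome(recovery_s, erc_limit_s, controller_timeout_s):
    """Classify what the controller sees for one hard-to-read sector.

    recovery_s: how long the drive would keep retrying on its own
    erc_limit_s: TLER/CCTL/ERC cap in seconds, or None if not set
    controller_timeout_s: how long the controller waits for a reply
    """
    # With a cap, the drive answers (possibly with an error) at the cap.
    if erc_limit_s is None:
        response_s = recovery_s
    else:
        response_s = min(recovery_s, erc_limit_s)

    if response_s > controller_timeout_s:
        return "dropped"         # no reply in time, so the drive is kicked
    if erc_limit_s is not None and recovery_s > erc_limit_s:
        return "error reported"  # drive gave up early; the array can repair
    return "ok"                  # the drive recovered the sector itself
```

Under these assumptions, `drive_outcome(30, None, 8)` is a drop and `drive_outcome(30, 7, 8)` is a reported error the array can handle. A drive that answers in half a second is fine either way, which is the skeptic's point: the cap only matters if the drive is really spending that long on some internal operation.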

I grabbed 8 x WD20EADS myself when they first came out, and noticed dropouts too even on an Areca controller, but when I'd connect them directly in JBOD mode and looked at their smart values, reallocated sectors, etc. the "failed" drive would always be 100% healthy with zero errors, which made me think there's some other normal internal operation that just happens to take too long on certain WD drives, and gets masked when TLER is enabled.

You might say "who cares why the dropouts are happening if enabling TLER fixes the issue" and I'd suggest that people not bothering to question why it's happening is the cause of all the misunderstanding about it, and the common assumption that every make and model of drive needs a tool to change the deep error recovery timeout value or your RAID array is doomed.
 
Seagate CS has been very responsive since I pulled the professional card.

Hitachi CS I'm waiting on.

Samsung CS blew me off. Can't recommend their drives nor their support. I sent a nastygram to their public relations. ( When I say nastygram I mean exceedingly polite to the point of being nerve wracking ).

WD I'm about to open a ticket and ask them professionally what their stance is.
 
Can you expand on that? What did they say?

Samsung Technical Support said:

Dear Customer,

In reference to ytour RAID issue, We suggest to contact Samsung technical support : 1-800-726-7864


for further assistance of your issue.

Obnoxious spaces deleted.

This was in response to:

Wolvenmoon to Samsung Tech support said:
Dear Samsung,

My name is *censored* and I work with www.reclaimyourgame.com , a business specialized in testing computer game software licensing enforcement but also an organization that advocates for digital consumer rights.

I e-mailed Samsung customer support last week about this issue and did not receive a reply. It's possible something in my browser blocked it or that it was lost. I also am having difficulty navigating Samsung's drive model selection.

*Snip secret sauce to article that will be revealed in it*

What I would like from Samsung are the following:

-A brief explanation of what CCTL is in their own words.

-If CCTL can be enabled on consumer level drives.

-If not, why Samsung does not allow this. If so, how.

I am asking this because what I have to go on right now is that Samsung actively prevents consumers from using their drives in RAID arrays and that the difference between a RAID level Samsung drive and a consumer level drive is the CCTL setting. I do not want to say something that is untrue. My readers are technically inclined as am I and advertising for the enterprise drives will not work. They will demand an under the hood look.


Thank you very much for your time,
*name censored*


My first letter was much less stand-off-ish. Unfortunately I didn't have the foresight to save it from their form.

Edit: The snip says exactly what I'm doing with this article. Spoilers aren't fun. ;)
 
CCTL stands for Command Completion Time Limit, and as I understand it, is part of the SMART specification. So any drive that supports SMART will support an error timeout of some kind.

These are desktop class drives. Samsung isn't "actively" preventing you from using them in a RAID array, but they most likely aren't going to support them in such a configuration.
The deep error recovery that these drives go into is actually a good thing in a non-RAID config.

Also, a quick google of CCTL yields the following link as the first result:

http://www.samsung.com/global/business/hdd/learningresource/whitepapers/LearningResource_CCTL.html

This CCTL setting can be set on Samsung drives, making them usable in RAID arrays. Unfortunately this setting is a "soft" setting and doesn't survive a power-off.
Meaning that if your controller can issue this command each time it's initialized, the timeout won't be an issue.
 
I know exactly what it is, I'm just letting them define it in their own terms. I would hate to put words into their mouths.
 
Well it was only a whole decade ago that Hitachi bought IBM out and started over, but "deathstar" does have a nice ring to it so why not keep it going? :) Meanwhile there are those of us that have been quietly and contentedly running desktop class Hitachis in RAID for years with zero problems or dropouts, happily sitting back and watching everyone chase their tails.

It's more a case of just not having happened to buy Hitachi. I switched to Seagate for my first couple of SATA drives, and since then have had WDs, mainly Raptors. I know there haven't been issues since IBM sold the biz to Hitachi.

I challenge you to get a Hitachi to drop out of an array, and if it does, it likely has little to do with its particular CCTL timeout value and more likely due to bigger problems with that particular drive.

That's reassuring confidence :)

I think enabling TLER on WD drives has the *side effect* of masking some other issue that makes certain desktop class WD drives raid-unfriendly, and take too long to perform some internal operation which doesn't necessarily have to do with error correction or bad sector reallocation. I say that because other brands clearly don't suffer from same, without ever having to make a change to their default deep error recovery timeout.

I've had drive dropout issues with a Seagate ST31000528AS, various WD drives, and Samsung HD103UJ's so it isn't just a WD problem.

That said, since using WDTLER on 2 of the WD drives I haven't had any more drop-outs from any drives. It's been a couple of months or so...
 
I've had drive dropout issues with a Seagate ST31000528AS, various WD drives, and Samsung HD103UJ's so it isn't just a WD problem.

Which RAID controller were you experiencing the dropouts with? Not every dropout is caused by the same thing. For example, certain models of RAID controllers with a particular integrated expander are incompatible with certain drives, and over time I've seen plenty of threads where people just assumed TLER/CCTL/ERC as the culprit. I still maintain that different combinations of drives and controllers each have a unique set of circumstances, which may or may not lead to dropouts and other issues, which may or may not have to do with the TLER/CCTL/ERC timeout value.

Apparently, http://hardforum.com/showthread.php?t=1482622&page=19 , Hitachi is shipping out a new wave of 2TB drives. This has me nervous. Heh heh.

What's the relevance of a new model of Hitachi 2TB drive, and why does it have you "nervous" even though you have no personal hands-on experience with Hitachi's, at least that you've stated?

Don't mean to be a brick but a lot of assumptions are being made in this thread and I wanted to offer some measure of counterpoint. As the saying goes, "stick to the facts."
 
What's the relevance of a new model of Hitachi 2TB drive, and why does it have you "nervous" even though you have no personal hands-on experience with Hitachi's, at least that you've stated?

Don't mean to be a brick but a lot of assumptions are being made in this thread and I wanted to offer some measure of counterpoint. As the saying goes, "stick to the facts."

My issue is that Hitachi may decide to 'do what everyone else is doing'. We don't know if these new drives will work like the old ones.

I've been doing back and forth with customer support groups all week, researching my arse off, I'm fairly certain of my facts because the divide and conquer method has worked excruciatingly well with these companies. I'm giving WD, Samsung, Hitachi, and Seagate another business day to finish up saying what they're going to say, one more to get any final questions in, and then I'm closing up research on this.



I really don't care if error handling isn't the root cause of all RAID problems for consumer drives. It IS *A* problem for RAID arrays with consumer drives and the fact is it's a very easily preventable problem.
 
Which RAID controller were you experiencing the dropouts with? Not every dropout is caused by the same thing. For example, certain models of RAID controllers with a particular integrated expander are incompatible with certain drives, and over time I've seen plenty of threads where people just assumed TLER/CCTL/ERC as the culprit. I still maintain that different combinations of drives and controllers each have a unique set of circumstances, which may or may not lead to dropouts and other issues, which may or may not have to do with the TLER/CCTL/ERC timeout value.

2 Areca 1220's. No expanders.

Don't mean to be a brick but a lot of assumptions are being made in this thread and I wanted to offer some measure of counterpoint. As the saying goes, "stick to the facts."

That's fair enough, but I have had problems with drives dropping out of my array when there is nothing wrong with them. Having used the WD TLER tool on 2 of the WD drives, those 2 drives have NOT dropped out since, whereas before they frequently dropped out. So I'm pretty confident of my diagnosis.

I don't want to spend several hundred quid on new drives only to find they drop out of the array all the time, so I'd like drives where I can enable TLER/ERC/CCTL.
 
I think you're going about this the wrong way. Instead of singling out the HD manufacturers, why don't you look at the RAID controller manufacturers and why they don't let you set specific timeout values in their firmware?

As much as some people on this forum like to think otherwise, the consumer drives are not identical copies of the enterprise drives the OEMs make. The enterprise drives have different firmware, different parts, and are manufactured to withstand higher operating temperatures and tolerances. They are not 'double the price' just for the hell of it. That's like saying Windows Server and the Desktop client are the same because they both have a GUI and a mouse cursor.

As many have pointed out before, the issue with dropouts is with hardware RAID controllers. Software RAID has none of these issues. I wonder why?

If the RAID controller manufacturers' primary customers are end users then they should cater to these people and test their controllers with consumer drives and tweak their firmware as needed. The hard drive OEMs' primary clients are large resellers, not people on Hard Forum. They have no reason to change what they do, all their money comes from the Dell/HP/Storage vendors. Not Newegg.com.
 
@danman

I would bet that if someone ripped the firmware from a WD RE3 drive and put it on an equivalent sized caviar black SATA II drive you'd have a working drive. Really the only company on my @&*!list as a consumer is WD

Seagate has said they manufacture their enterprise drives to a higher standard than their desktop drives. Hitachi has said that as well. I believe Hitachi and since Seagate has been the most sincere customer support I've talked to over this I'm going to believe them as well. I haven't heard anything from Samsung and I'm waiting on WD.

There's this damning thing called interface, though. I don't know of any enterprise harebrained enough to run SATA drives over SAS drives in servers where high availability is critical. https://secure.wikimedia.org/wikipedia/en/wiki/Serial_attached_SCSI#SAS_vs_SATA


Do large businesses really put 48 SATA drives in a 4U rackmount with 10 gigabit or 40 gigabit connections? I can see a small business with a single building and 15-20 employees using 6 to 8 SATA drives on the cheapest controller they can get - probably built into their motherboard, but I've always assumed that once you got bigger than that, you bought SAS.

@no one in particular
After Samsung's reply from their customer support ( still waiting on their PR person ) I found myself unwittingly subscribed to their newsletter on both my personal and professional e-mails. I'm fairly sure I was paying attention to their form and didn't check a box saying 'please subscribe me to your newsletter', because I generally uncheck them.

So I went to unsubscribe.

Samsung unsubscribe page said:
unsubscribe
We're sorry to see you go. Please allow 7-10 business days for us to process your request.
*Headdesk*
It's an SQL database. I think they need a faster storage system.
 
I think you're going about this the wrong way. Instead of singling out the HD manufacturers, why don't you look at the RAID controller manufacturers and why they don't let you set specific timeout values in their firmware?
I already raised that issue at the start. I also linked to my blog where I've made the same point - I don't care how the problem is solved but I need to find a solution.

As much as some people on this forum like to think otherwise, the consumer drives are not identical copies of the enterprise drives the OEMs make. The enterprise drives have different firmware, different parts, and are manufactured to withstand higher operating temperatures and tolerances. They are not 'double the price' just for the hell of it. That's like saying Windows Server and the Desktop client are the same because they both have a GUI and a mouse cursor.

Windows Server and Client are almost entirely the same codebase - Microsoft would be crazy NOT to do that. Just like CPU manufacturers test and bin CPUs by speed (or working cores). Ditto with the HDD manufacturers - the basic hardware design is often very similar, if not identical. Sadly the HDD manufacturers seem to only want to sell me either a horse-and-cart, or a Bugatti Veyron. I'd just like a middle-ground drive - and I wouldn't mind paying 20% more for it either. I do object to paying 100% more for a firmware change - and that IS the difference between some of the WD drives. I have personally flashed WD RE firmware onto non-RE drives, which works faultlessly.

Incidentally - they are "double the price" because that is the price that the market will pay, not because they cost a penny more to make. That's how capitalism works.

As many have pointed out before, the issue with dropouts is with hardware RAID controllers. Software RAID has none of these issues. I wonder why?
I believe Wolvenmoon said he's had problems with an intel "software" raid in this very thread? Also - if I had £600 to blow on a new raid controller, I could also afford to throw the same cash at enterprise-grade drives. I don't. I already have a raid controller, and I need some cheap drives that will work with it, for a low intensity working environment.

If the RAID controller manufacturers' primary customers are end users then they should cater to these people and test their controllers with consumer drives and tweak their firmware as needed. The hard drive OEMs' primary clients are large resellers, not people on Hard Forum. They have no reason to change what they do, all their money comes from the Dell/HP/Storage vendors. Not Newegg.com.

I don't think anyone has, or will, argue this with you. I am aware that I am a vanishingly small market, but that doesn't make any difference to my search for solutions. I posted on here to contribute to the search for a solution to a problem. You've provided some very helpful information about Hitachi drives. Settle for that :)
 
I already raised that issue at the start. I also linked to my blog where I've made the same point - I don't care how the problem is solved but I need to find a solution.

Sure, ok

Windows Server and Client are almost entirely the same codebase - Microsoft would be crazy NOT to do that. Just like CPU manufacturers test and bin CPUs by speed (or working cores). Ditto with the HDD manufacturers - the basic hardware design is often very similar, if not identical. Sadly the HDD manufacturers seem to only want to sell me either a horse-and-cart, or a Bugatti Veyron. I'd just like a middle-ground drive - and I wouldn't mind paying 20% more for it either. I do object to paying 100% more for a firmware change - and that IS the difference between some of the WD drives. I have personally flashed WD RE firmware onto non-RE drives, which works faultlessly.

Server and Client are the same code base but not the same components. Why do you think the client is released 6 months sooner than the server is? It's because the components that are added to the server product are not included in the client.

And the hard drives are not identical. Where is your proof of this? You just think it is and it's a giant conspiracy to charge consumers more for an identical drive with different software? Come on. They are not the same.

Incidentally - they are "double the price" because that is the price that the market will pay, not because they cost a penny more to make. That's how capitalism works.

They are not double the price because of market forces but because they are kept to a higher standard and cost more to produce. If you buy in bulk then the cost is a lot lower. Like what the large OEMs pay.


I believe Wolvenmoon said he's had problems with an intel "software" raid in this very thread? Also - if I had £600 to blow on a new raid controller, I could also afford to throw the same cash at enterprise-grade drives. I don't. I already have a raid controller, and I need some cheap drives that will work with it, for a low intensity working environment.

So ask Intel why their software doesn't work with consumer drives. Linux works fine. Why is that?

I don't think anyone has, or will, argue this with you. I am aware that I am a vanishingly small market, but that doesn't make any difference to my search for solutions. I posted on here to contribute to the search for a solution to a problem. You've provided some very helpful information about Hitachi drives. Settle for that :)

I don't work for any HD manufacturer and I am not standing up for them. I just know how business works. Nothing is free and when you make two different products for two different markets and the lower market wants your higher end product for the same price as your low end product you have to ask yourself why? They make two different products and people assume they are the same because they are the same form factor and same interface.

Is select beef the same quality as prime cut? It looks the same. Does it taste the same?

Your rant and Wolvenmoon's are full of bullshit. You both want something for free. If you want an enterprise class storage subsystem then pay for it. If you want a consumer class subsystem then you have it.
 
@danman

I would bet that if someone ripped the firmware from a WD RE3 drive and put it on an equivalent sized caviar black SATA II drive you'd have a working drive. Really the only company on my @&*!list as a consumer is WD

I never said the firmware was the entire part of the equation. The parts of the drive are not the same. Your problem is that you expect a consumer part to perform the same as an enterprise part. If you ran a hosting service would you run everything off of Windows 7 versus Windows Server? According to you guys they are the same. What's the difference?
 
And the hard drives are not identical. Where is your proof of this? You just think it is and it's a giant conspiracy to charge consumers more for an identical drive with different software? Come on. They are not the same.

I've offered evidence - my experience of firmware flashing, and a common-sense explanation of why if I was a drive manufacturer I would try to use the same hardware, or very similar. If you want to assert the opposite perhaps you can offer some evidence for your assertion?

They are not double the price because of market forces but because they are kept to a higher standard and cost more to produce

Maybe they do cost more to produce, maybe they don't. Why would they cost more to produce? IMO the majority of the difference in price is due to the way the market works. Either way - it wouldn't cost any extra to include the option to enable TLER in the firmware for consumer drives, as WD did in the past, so that issue is irrelevant.

Nothing is free and when you make two different products for two different markets and the lower market wants your higher end product for the same price as your low end product you have to ask yourself why? They make two different products and people assume they are the same because they are the same form factor and same interface.

I don't want an enterprise drive, with the higher quality manufacturing you claim it has, and the better warranty we agree it has. I want a cheap drive that works. I have no issue with charging more for enterprise drives - I would pay more where appropriate - because the premium price includes a premium warranty and premium technical support. As I said before - I don't need 24/7 uptime or longer warranties, or higher reliability hardware. The drives are to be used in a consumer environment, with typical home user use-patterns.

Your rant and Wolvenmoon's are full of bullshit. You both want something for free.

No need to be rude or personal. We can disagree in a civilised way. I don't want something for free, I want a consumer class drive, for a consumer environment. I would like it to work. I'm not going to waste any more time arguing what should or shouldn't be the case, or why it is the case, etc. etc. I am simply going to establish which (if any) drives are within my budget and are suitable for my needs.
 
I may be spewing bullshit out of the mouth, but I've had a customer support representative say that consumer drives will work fine in RAID 0 and 1 on any controller. When pressed they responded that it doesn't matter if a drive temporarily stops responding in those levels of arrays.

I don't expect something for free. I expect my computer to do exactly as I tell it to. Right now I demand 24/7 uptime from consumer parts and for the past 3 years I've gotten it. Cool, huh?

The fact is that enterprise environments can enable this functionality in Linux operating systems by automatically issuing the SMART commands every power cycle to prevent the drives from going into long term error recovery cycles. Enterprises aren't affected by this, and if they are, they'll buy a new $1200 controller to be able to issue the SMART commands to the drives. Most can afford to do that. Not sure why an enterprise would run consumer drives anyway...

Because the enterprise drives are apparently, by what you say, always manufactured to a better standard than consumer drives. ;)

In WD's case I KNOW the difference is firmware only and it could be that they bin drives - ones with errors during testing get to be cav blacks, ones without go out as RE. Why? Both knowing how technology works and noticing they don't have anything higher - barring the raptors - than 7200 RPM 'dual processor' SATA drives. I have also literally gotten a "Toasters toast toast" answer out of them regarding TLER being removed on caviar black drives and caviar black drives dropping out of RAID arrays.

Seagate, on the other hand, has provided me with a huge amount of information demonstrating that its enterprise grade drives are manufactured to a higher level than their consumer drives. Seagate also produces SAS drives.

Hitachi has been a bit hard to question directly and I don't own enough of their drives to compare to each other. They HAVE, however, conclusively said that you can enable CCTL on their drives.

Samsung has shocked me. I've always loved their monitors and generally anything else I get from them. Their support site was very difficult to navigate, threw errors at me, et cetera, and I got blown off in the end regardless.

I don't expect something for nothing. As Seagate CS said - consumer drives have lower duty cycles, lower performance, and lower reliability than enterprise level drives - this is probably true amongst their lines.


You know, the fact is these drives aren't even performing up to the specifications they give on the box. I expect about 25,000 hours power on time before my drives fail, go check the MTBF numbers we get on the drives' specifications page. I'd have NO problem with needing redundancy if my caviar blacks were lasting 1.2 million hours. :cool:
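To put the two figures above side by side, here is a rough sketch of the arithmetic. It assumes a constant failure rate (the usual exponential model, which is an assumption and not something the drive makers state in this thread), under which an MTBF figure describes a failure rate across a large population of drives and is not a promised lifespan for any one drive. The function names are made up for illustration.

```shell
#!/bin/sh
# Convert hour figures into years of continuous running, and an MTBF
# into a rough annualized failure rate (AFR).
# Assumption: constant failure rate, so AFR = 1 - exp(-8760 / MTBF).

hours_to_years() {
    # 8760 = hours in a non-leap year
    awk -v h="$1" 'BEGIN { printf "%.1f\n", h / 8760 }'
}

afr_percent() {
    awk -v m="$1" 'BEGIN { printf "%.2f\n", (1 - exp(-8760 / m)) * 100 }'
}

echo "25,000 h power-on time   = $(hours_to_years 25000) years"
echo "1.2 million h MTBF       = $(hours_to_years 1200000) years"
echo "AFR implied by that MTBF = $(afr_percent 1200000)%"
```

Under that model a 1.2 million hour MTBF works out to well under 1% of drives failing per year, which is a different kind of claim from "each drive lasts 137 years".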


P.S., I have not tested without TLER on my Intel Matrix storage. These drives were transplanted over from something on my Asus K8VB and the array rebuilt. I'd enabled TLER on them back then. My 160 gig Hitachi drives ran in RAID-0 for 15,000 hours before I took them out of it, and they ran without CCTL, this was done on a Gigabyte K8N pro's nVidia controller.
 
Also, I've spent about 2 hours a day every day for the past 10 work days doing research on this.

Not sure what this forum's policies are - I just go by netiquette standards - but if we get another round of personal attacks I'm going to report you. *Shrug*
 
I hope this will help you some with your article. I too have been frustrated with consumer grade drives timing out of my hardware RAID (3ware 9690-4i + 2 Chenbro expanders) and I've now been able to keep everything stable after lots of trial and error. I had everything working well with 1 expander and 16 x 1TB Hitachi drives for about a year, then the 2TB drives came down enough for me to say yes and do an upgrade.

I am currently using Ubuntu 10.10LTS x86_64, but that was not quite good enough for what I wanted to do. I downloaded the source for smartmontools-5.41, compiled and installed it, and set the timeout to 7 seconds in the /etc/rc.local file.

root@openfiler:~# smartctl -d 3ware,20 -l scterc,70,70 /dev/twa0;
smartctl 5.41 2010-11-05 r3203 [x86_64-unknown-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

SCT Error Recovery Control:
Read: 70 (7.0 seconds)
Write: 70 (7.0 seconds)

root@openfiler:~# smartctl -d 3ware,20 -i /dev/twa0;
smartctl 5.41 2010-11-05 r3203 [x86_64-unknown-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family: Hitachi Deskstar 7K2000
Device Model: Hitachi HDS722020ALA330
Serial Number: JK1171YAKH3W8N
Firmware Version: JKAOA3EA
User Capacity: 2,000,398,934,016 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Mon Nov 22 20:51:42 2010 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
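The rc.local step described above could look roughly like the sketch below. It only reuses the smartctl options shown in the session (-d 3ware,N and -l scterc,READ,WRITE, with values in tenths of a second). The port list, the variable names and the dry-run switch are assumptions added for illustration, not part of the original setup.

```shell
#!/bin/sh
# Sketch of an /etc/rc.local fragment that re-applies a 7.0 second
# error recovery limit to every drive behind a 3ware controller at boot.
# Assumptions: PORTS, CTRL, ERC and DRY_RUN are illustrative names.
CTRL="${CTRL:-/dev/twa0}"    # controller device, as in the session above
PORTS="${PORTS:-0 1 2 3}"    # hypothetical list of controller ports
ERC="${ERC:-70}"             # tenths of a second; 70 = 7.0 s
DRY_RUN="${DRY_RUN:-1}"      # 1 = only print the commands

erc_command() {
    # Print the smartctl invocation for one port: port, limit, device.
    printf 'smartctl -d 3ware,%s -l scterc,%s,%s %s\n' "$1" "$2" "$2" "$3"
}

apply_erc() {
    for port in $PORTS; do
        cmd=$(erc_command "$port" "$ERC" "$CTRL")
        if [ "$DRY_RUN" = "1" ]; then
            echo "$cmd"
        else
            # Run it; a nonzero exit is only reported, not treated as fatal.
            $cmd || echo "smartctl reported a problem on port $port" >&2
        fi
    done
}

apply_erc
```

As written it only prints the commands. Setting DRY_RUN=0 would actually run them, so check the port list against your own controller first.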

I found this setting works with the WD20EADS, Hitachi 7K2000 and Samsung HD204UI

I found this SMART feature did not work with the WD20EARS and have since removed them from my arrays. I have not tried this on any drives other than the 4 I mentioned, but I'm sure a lot of folks around here would be interested in the results.

I did some initial testing using SystemRescueCD, as it ships with smartmontools-5.40, which is the first version to support the scterc setting, and the CD also supports more hardware than any livecd I've found. Just be careful working with your ext3 filesystem as it uses e4fs-progs and does some things to ext3 that e2fs doesn't understand.
 
Oh wow.

Yes, mitgib, this does help. How many WD20EARS did you try this on?

I intended to end my research period tomorrow, so this is very surprising. I just wikipedia'd S.M.A.R.T. and got a reminder that it isn't really an industry standard. What standard are these commands a part of?
 
Oh wow.

Yes, mitgib, this does help. How many WD20EARS did you try this on?

I intended to end my research period tomorrow, so this is very surprising. I just wikipedia'd S.M.A.R.T. and got a reminder that it isn't really an industry standard. What standard are these commands a part of?

I only have 5 WD20EARS, and I imagine all were from the same lot, but I've seen similar results from other people posting on their tests with smartctl around the net.

As I understand it (and I'd have to search it out again to find where I saw it), scterc is in the S.M.A.R.T. ATA-8 spec, but none of the drive manufacturers have agreed to the spec, so each will support the parts of the spec they feel like. I don't see consumers forcing anybody's hand on that either, and Dell/HP make way too much off reselling enterprise class drives to care much about consumer class drives.

Also, one last thought: props to Hitachi, theirs are the only consumer class drives on 3ware's tested compatibility list.
 
Are you sure this whole issue isn't overblown? :confused:

I have several RAID arrays in my various PCs, all using consumer Samsung & Seagate drives, running for years on a variety of controllers (3ware 7506-8, Dell PERC 5i, Dell PERC 6i, Intel ICH9R, Gigabyte / JMicron controller). Between all the drives on all the different controllers in the various PCs seeing different usage patterns, I have seen only a single instance of a drive dropping from an array for no apparent reason (and it then rebuilt without issue).

A long time ago when 200GB IDE drives were as big as they got I had all sorts of issues with WD drives dropping from a 3ware 7506-8, so I'm just not seeing this as a hill to die on.

Sorry. :eek:
 
Are you sure this whole issue isn't overblown?

In my opinion, if an HDD has to go into deep recovery mode to try to read a sector correctly, I want to replace that HDD as soon as possible. The deep recovery occurrences are rare enough, in my experience, that I interpret them as an indicator that the HDD is much more likely to fail more seriously (i.e., more than just a bad sector) in the near future, as compared to an HDD that has never gone into deep recovery.

So, if a drive gets dropped from a RAID because it has gone into deep recovery, that is usually a good thing. Assuming I am able to rebuild the RAID. Which is a big assumption that needs more explanation.

I always use at least RAID 6 for any arrays that contain HDDs larger than 1TB. This is because the rebuild time is longer for larger HDDs, and there is a greater probability of failure or encountering another bad sector during the rebuild.
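A back-of-envelope sketch of why rebuild risk grows with drive size. It assumes an unrecoverable read error rate of 1 per 10^14 bits read (a figure often quoted on consumer drive spec sheets, used here purely as an assumption) and treats errors as independent, which real drives do not strictly obey, so take the output as an order-of-magnitude illustration only. The function name is made up.

```shell
#!/bin/sh
# Chance of hitting at least one unrecoverable read error (URE) while
# reading a given amount of data, e.g. the surviving drives in a rebuild.
# Assumptions: independent errors at a rate of 1 per 10^EXP bits.
# P = 1 - exp(-bits_read / 10^EXP)

ure_percent() {
    # $1 = terabytes read (decimal TB), $2 = exponent of the error rate
    awk -v tb="$1" -v e="$2" 'BEGIN {
        bits = tb * 1e12 * 8
        printf "%.1f\n", (1 - exp(-bits / 10^e)) * 100
    }'
}

echo "Read 2 TB at 1 per 10^14: $(ure_percent 2 14)%"
echo "Read 8 TB at 1 per 10^14: $(ure_percent 8 14)%"
echo "Read 8 TB at 1 per 10^15: $(ure_percent 8 15)%"
```

With single parity, one such error during a rebuild can stall or fail it. With RAID 6 the second parity can still reconstruct the sector, which is the point being made above.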

In addition, I set things up so my RAIDs get scrubbed about once a month. That makes it much less likely that a bad sector will be encountered during a rebuild, since hopefully any drives with bad sectors will get dropped within a month of the bad sector first occurring (during a scrub).
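For anyone on Linux software RAID, a monthly scrub along these lines can be sketched as below. This is an assumption about the setup: the post does not say which RAID implementation is being scrubbed, and hardware controllers have their own scheduled verify features. The md driver starts a scrub when "check" is written to the array's sync_action file in sysfs. The script path in the cron comment and the MD_ARRAY variable are hypothetical.

```shell
#!/bin/sh
# Sketch: ask the Linux md driver to scrub one array.
# Assumes Linux software RAID; MD_ARRAY (e.g. md0) is an illustrative name.
#
# Hypothetical cron entry, 02:00 on the 1st of each month:
#   0 2 1 * * root MD_ARRAY=md0 /usr/local/sbin/md-scrub.sh

start_scrub() {
    # $1 = sysfs directory of the array, e.g. /sys/block/md0
    action_file="$1/md/sync_action"
    if [ -w "$action_file" ]; then
        echo check > "$action_file"
    else
        echo "cannot write $action_file" >&2
        return 1
    fi
}

# Only act when an array name was supplied.
if [ -n "${MD_ARRAY:-}" ]; then
    start_scrub "/sys/block/$MD_ARRAY"
fi
```

Progress of a running check shows up in /proc/mdstat.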

So I tend to agree with you that the whole TLER issue is less important than it first appears. Nevertheless, it is too bad that all HDD makers do not have a simple jumper or DIP switch that can limit the recovery time to 7 seconds. And I have never understood why the hardware RAID card manufacturers do not allow the user to set the timeout in seconds for dropping a drive.
 