Western Digital Time-Limited Error Recovery (TLER) with RAID and SE, SE16, GP Models

That would explain things then, with that stripe size. You're taxing it with a lot of calculations, plus the drives are writing things out in 16KB chunks so their speed will be lower (look at drive benchmarks for 16KB writes vs. 1MB or larger writes), so in this case you're being affected by spindle speed and possibly maxing out the processor.

I wrote to Adaptec support and that's what they've told me as well.
They recommended a stripe size of 256KB (this is what the controllers were optimized for).
They said their engineers worked hard to make the default values work as well as possible, so I should revert the controller's settings to the defaults (I know I made some changes as well). I will also try creating another array with a 256KB stripe size, see what my results are in terms of performance, and report back. I should probably see an increase in performance.

How does the stripe size affect the space used for small files?
For example, if I'm writing an 8KB file, will it use up a whole 256KB stripe, or will other files be "squeezed" onto the same stripe as well? How does this work?
The way I understand it, it works like this:
Assume I have a RAID5 array with 4 drives: 3 of them are used for data and one for parity (actually the data and parity stripes sit on all 4 drives, because they rotate). A 768KB file, say, is divided into 3 chunks of 256KB and written to the 3 drives, and then the 4th drive gets a sum of the 3 chunks written on the other drives. Files larger than 768KB are split into parts of 768KB each, and this repeats. However, I wonder what happens with the remainder of the data (not every file divides evenly into 768KB)? How is that data handled? Will it occupy a whole 256KB chunk, or will that chunk be "shared" with other files? If chunks aren't shared, I see no point in choosing a small stripe for the array; just create the largest one. So I assume the extra space is lost..
Does anyone know exactly how this works?
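The layout described above can be sketched in a few lines of Python. This is a toy model only: the "sum" on the parity drive is actually a byte-wise XOR, and real controllers rotate the parity chunk across drives, which is omitted here; the constants and function names are mine, for illustration.

```python
# Toy model of RAID5 full-stripe writes on a hypothetical 4-drive array
# with a 256KB stripe unit (3 data chunks + 1 parity chunk per row).
STRIPE = 256 * 1024
DATA_DRIVES = 3

def xor_parity(chunks):
    """Parity is the byte-wise XOR of the data chunks, not an arithmetic sum."""
    parity = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            parity[i] ^= byte
    return bytes(parity)

def stripe_file(data):
    """Split data into rows of 3 data chunks + 1 parity chunk, padding the tail."""
    row_bytes = STRIPE * DATA_DRIVES            # 768KB of data per row
    rows = []
    for off in range(0, len(data), row_bytes):
        row = data[off:off + row_bytes].ljust(row_bytes, b"\x00")
        chunks = [row[i:i + STRIPE] for i in range(0, row_bytes, STRIPE)]
        rows.append(chunks + [xor_parity(chunks)])
    return rows
```

Note the tail padding: at the RAID level a partial row still produces full stripe writes, but whether that leftover space is wasted is a filesystem question, not a RAID one.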
 

The HDD space wasted by a file that does not completely fill its last cluster (the first clusters are full; the last one probably won't be) cannot be reused by other files. So the wasted space is independent of the stripe size; it depends on the cluster size instead.
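In other words, slack space is a filesystem-level quantity. A quick sketch of the arithmetic, assuming NTFS-style 4KB clusters (the function name is mine, for illustration):

```python
import math

def slack_bytes(file_size, cluster_size=4096):
    """Space lost to the partially filled last cluster of a file."""
    if file_size == 0:
        return 0
    return math.ceil(file_size / cluster_size) * cluster_size - file_size

# The stripe size never appears in this formula: an 8KB file on 4KB clusters
# wastes nothing, regardless of whether the stripe is 16KB or 1024KB.
```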

For volumes with parity, the RAID controller's performance can vary considerably with different stripe sizes.

Do not forget, however, that the minimum addressable unit in a partition, from the OS's point of view, is the cluster, not the RAID stripe. The OS converts clusters into groups of adjacent sectors.

The OS then asks the controller to read and write groups of sectors.

Let me give an example:
- Assume a partition with 4KB clusters (the default for NTFS partitions in Windows), each consisting of eight 512B sectors.
- The partition resides on a RAID5 logical volume consisting of 3 HDDs.
- The stripe is larger than 4KB.

1) The OS reads a group of sectors from a healthy volume: the controller only has to read the portion of the stripe that contains those sectors and return them to the OS.

2) The OS reads a group of sectors from a degraded volume: the controller has to read two full stripes (2 data, or 1 data + parity), reconstruct the missing stripe, and return the sectors to the OS.

3) The OS writes a group of sectors to a healthy volume: the controller must read two full stripes (2 data), modify the stripe to incorporate the given sectors, calculate the new parity, and write three full stripes (2 data + 1 parity).

4) The OS writes a group of sectors to a degraded volume: the controller must read two full stripes (2 data, or 1 data + parity), reconstruct the missing stripe from parity, modify the stripe to incorporate the given sectors, calculate the new parity or data, and write two full stripes (2 data, or 1 data + parity).

It is clear, then, that merely writing a single 512-byte sector can lead to reading two stripes, one or two parity calculations, and rewriting up to three stripes.

This suggests using a small stripe, but there is a big problem: smaller stripes cause an increase in calculations and I/O requests.
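The four cases above can be tallied in a toy cost model. The counts follow the reconstruct-write behavior described in this post; a real controller may use read-modify-write instead (read old data + old parity), with different counts.

```python
# Whole-stripe reads/writes per small-sector request on a 3-drive RAID5
# (2 data + 1 parity), per the four cases described above.
def io_cost(op, degraded):
    if op == "read":
        # Healthy: read just the stripe holding the sectors.
        # Degraded: read the two surviving stripes to rebuild the missing one.
        return {"reads": 2 if degraded else 1, "writes": 0}
    if op == "write":
        # Healthy: read 2 data stripes, write 2 data + 1 parity.
        # Degraded: read the 2 surviving stripes, write the 2 that remain.
        return {"reads": 2, "writes": 2 if degraded else 3}
    raise ValueError(op)

# A single 512B logical write can thus cost 2 stripe reads and 3 stripe
# writes; with 256KB stripes that is 1.25MB of physical I/O for 512 bytes.
```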
 
Interesting post, Iron67... but it may not be entirely true.
According to a guy named Tom Treadway, who seems to work for Adaptec, the stripe size does not affect the usable space. That is, no space is wasted on small files even if the stripe size is large.

Anyway, I hope this helped explain how stripe size affects performance. The bottom line is that the default stripe size is typically best, and the default is usually a big number, such as 256KB. If you want to play around with reducing stripe size, make sure you do plenty of real-world performance testing.

That's correct. Smaller stripe sizes will definitely NOT affect drive space. Heck, the OS doesn't even know that striping is occurring (assuming it's happening on a hardware RAID card), so there's no way it could affect drive space. I suppose someone like Microsoft could write RAID code that was tightly integrated into the filesystem, but as far as I know that hasn't happened.

So, to your question: no, having a 4KB file (for example) residing on part of one stripe does not preclude a different file from residing on the remaining part of that stripe.

Here's the link:
http://storageadvisors.adaptec.com/2006/06/05/picking-the-right-stripe-size/
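Treadway's point follows from where the address translation happens: the filesystem allocates clusters by logical block address, and only the controller maps those LBAs onto disks and stripes. A rough sketch of that mapping (names and layout are illustrative; parity rotation is omitted):

```python
def lba_to_location(lba, sector=512, stripe=256 * 1024, data_disks=3):
    """Map a logical block address to (disk, stripe row, byte offset)."""
    sectors_per_stripe = stripe // sector
    stripe_index = lba // sectors_per_stripe
    return {
        "disk": stripe_index % data_disks,
        "row": stripe_index // data_disks,
        "offset": (lba % sectors_per_stripe) * sector,
    }

# Two small files at adjacent LBAs land at adjacent offsets, usually inside
# the same stripe: striping never reserves a whole stripe per file.
```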

After reading the full article, it seems that most of the time a bigger stripe is better, even for small files. So I decided to go for the max stripe size my controller allows: 1024KB.
I should've tried 256KB first, but I don't have that much time for tests, so I'm just going to trust that guy; he seems to know what he's talking about :)
 
EDIT:

Tested today a WD15EADS-00Sxxxx with a production date of 07 December 2009 and, surprise... WDTLER IS WORKING, AND SO IS WDIDLE3.

I enabled TLER and also set the idle time to 300 seconds. After that I powered off the PC for 5 minutes, ran WDTLER and WDIDLE3 again, and everything was as I had set it.

So you can buy WD Green drives with an 11th character of "S". Maybe others can test and reply whether it works for them as well.
 
Hey there, has anybody tried the WDTLER tool on an RE3 HD? I have the WD1002FBYS. Anyone?

The whole point of the WDTLER tool was to disable TLER on enterprise drives - being able to enable it on consumer drives was a side benefit. I'm guessing it'll work.
 
Well, I just received an RMA for a Caviar Black WD1001FALS-00J7B (firmware 5.00K05) that developed SMART errors. It was one of a pair I couldn't get working with a HighPoint RAID card, and as soon as I tried to use it in my NAS I realized why: the NAS had a SMART tool built in and gave a decent error!

Anyway, these were 4/2009 drives and I was able to enable TLER just fine (7 seconds). The replacement is 12/2009, WD1001FALS-00E3A (firmware 5.01D05) and I can confirm message #103 - they no longer respond to the WDTLER tool. I don't remember the exact message, but it was essentially 'TLER support not found'.

I wonder how my Synology NAS will do with a TLER 7 second drive and a non-TLER drive in a RAID 1? Guess I'll find out this weekend..

chris
 
^ It will work fine. TLER really only becomes an issue if and when a drive has to go into a deep error recovery cycle, remap bad sectors, etc. Some perfectionists might argue that at that point the drive is less than perfect anyway, so why keep it around; however, drives were designed for this and thus have a quantity of spare blocks available.

OTOH, I have a RAID6 array with 20 x WD1001FALS, all with TLER enabled, and one of them is developing bad sectors, though not enough to get kicked out of the array. I'm hesitant to RMA it because I know the drive I'll get back will be a newer one with crippled firmware. In practice it shouldn't (and won't) make a difference, but I'm one of those nuts who likes everything the same, so I hear you. There's always eBay; confirm the manufacturing date with sellers before bidding, or see if their photo shows it.
 
The whole point of the WDTLER tool was to disable TLER on enterprise drives - being able to enable it on consumer drives was a side benefit. I'm guessing it'll work.

Why on earth would you want to do that?

Either I am missing something or I haven't explained it correctly.
I want to use the WD1002FBYS (RE3) as a stand-alone drive, i.e. non-RAID. So I need to disable TLER on that disk. And that is why I am asking.
 
It doesn't matter. The drive will work just fine with it enabled.

... but yes, you would be better off disabling it so it does go through the motions of error recovery if and when it is needed. :p Contact WD if the util doesn't work and see if they can give you something that does.
 
It doesn't matter. The drive will work just fine with it enabled.
Well, it will definitely work, but is it OK in the long run? I did a lot of reading online, and most sites suggest TLER should be disabled for non-RAID use.
... but yes, you would be better off disabling it so it does go through the motions of error recovery if and when it is needed. :p Contact WD if the util doesn't work and see if they can give you something that does.
Whether the tool will work or not is something that concerns me (I'll try it in a couple of days and get back to you), but after searching Wikipedia, I quote this:

WD also states that using the WDTLER.EXE tool on newer drives can damage the firmware and make the disk unusable.

Maybe it's for marketing reasons, maybe not.
That's why I am asking if anyone has past experience disabling TLER on an RE3 device, and whether I should do it or not.
What the heck, I'll try it, and if it fails, I'll RMA it and play dumb.
 
You *should* be able to change TLER on an RE3 just fine, but I have no experience with that drive. There's no point in buying enterprise drives for home use, and even at work I question it, since the enterprise drives fail more often than the desktop versions in our arrays.

As for WD's statement about WDTLER on newer drives "damaging the firmware and making the disk unusable": utter bullshit.
 
... Whether the tool will work or not is something that concerns me (I'll try it in a couple of days, I'll get back to you), but after searching Wikipedia, I quote this: ...

Maybe it's for marketing reasons, maybe not. That's why I am asking if anyone has past experience disabling TLER on an RE3 device, and whether I should do it or not. What the heck, I'll try it, and if it fails, I'll RMA it and play dumb.

As already mentioned, the tool was specifically distributed for your type of situation (disabling TLER on RE drives used in standalone scenarios). As for Wikipedia, that note was most likely added after reports surfaced online that the tool no longer worked for the opposite of its intention (enabling TLER on consumer drives).

... No point in buying enterprise drives for home use, and even at work I question it since the enterprise drives fail more often than the desktop versions in our arrays. ...

But your arrays are most likely on 24/7, and the drive is most likely in a hotter environment due to the other drives in the array. I doubt a consumer-level drive would last longer in such an environment. I have an array of 10 'cudas that needs replacing more often than my other array of 10 RE3 drives (no replacements needed after a year)... needless to say, I've been replacing the 'cudas with ES drives and have had fewer problems with those lately. I know this is anecdotal, but I just wanted to share my experience as you did. ;)
 
replace girlfriend and press any key to continue.

lol :D


Hey everybody, just a little update. I finally got my hands on the WD1002FBYS (RE3). Great HD. Installed it, formatted it, and used the WDTLER tool successfully to change from the default 7/7 to 0/0. No side effects so far (firmware failure? lol), though WD support said that something like that would void the warranty, twice refusing to send me the tool.

Thank you all for helping! I'll keep you updated if anything comes up.
 
Has anyone run WDTLER on a 2TB WD, the WD20EADS model?

I got 6 of them, and the RAID keeps dropping a disk after writing excessive data. :(
 
I just did several WD20EADS drives, models WD20EADS-32RB0 and WD20EADS-00RB0, with date codes of Oct 2009 and Nov 2009 respectively... I also did 4 older WD20EADS-00RB0's whose date codes I don't have in front of me...

All the drives worked fine enabling 7-second TLER using the WDTLER utility from the first page, on the bootable thumbdrive that I use for Ghost.
 
Hey everyone, the key to getting the WD 1TB Black into a RAID array is to look at the DCM number underneath the manufacture date on the drive. I have tried over a dozen 1TB Black drives in the last couple of weeks, manufactured on various dates: Oct 09, Nov 09, and Dec 09. The drives that worked were model "WD1001FALS-00E8B0" with DCM numbers ending in "AB" or "CB". Would people please confirm? Thanks a lot.
 
Ok guys, got another couple of new batches to try today. WD1001FALS-00U9B0

DATE: Dec16
DCM: HBNNHT2CB
and
DATE: DEC09
DCM: HBNNHV2CB

They both came with TLER disabled. Turned TLER on without problems. I suggest people buy these drives locally and with an exchange policy.

Hope this helps.
 
So this essentially turns a 1TB Black into a 1TB RE4?
Or are there still a few other RAID optimization features unique to the RE3/RE4?
Man, it'd be so sweet if one could buy a 1 or 2TB drive and flash it to a full RE4!
 
Ok guys, got another couple of new batches to try today. WD1001FALS-00U9B0

DATE: Dec16
DCM: HBNNHT2CB
and
DATE: DEC09
DCM: HBNNHV2CB

They both came with TLER disabled. Turned TLER on without problems. I suggest people buy these drives locally and with an exchange policy.

Hope this helps.

Which firmware level did they come with?

I've just purchased four WD1001FALS with firmware level 05.00K05 and TLER can't be enabled.
 
I don't think it's an issue of firmware. I have 20 x WD1001FALS, purchased in September 2008, all of them with firmware 05.00K05, and I was able to enable TLER on all of them.
 
So this essentially turns a 1TB Black into a 1TB RE4?
Or are there still a few other RAID optimization features unique to the RE3/RE4?
Man, it'd be so sweet if one could buy a 1 or 2TB drive and flash it to a full RE4!

For those who are curious: no, it doesn't...
TLER is just one of the functions unique to near-line (enterprise-class) drives.
 
Just received a
WD20EADS-00S2B0 with date code 06 JAN 2010
TLER set to 7 seconds, survives a power cycle.
 
I have a PERC 5/i (LSI MegaRAID 8480) on the way and I'm now drive shopping. I'm glad I ran into this thread, because I thought none of the new WD drives supported the WDTLER utility anymore, but I guess that's incorrect. However, I have another question for the disk/RAID gurus here: I've read that this card allows the user to manually adjust the timeout for flagging drives as faulty, so can I just buy consumer drives and adjust the controller timeout accordingly?

I would just buy the WD 1TBs listed above, but I'm curious about the controller timeout because I already have a Hitachi 1TB I could use, and Microcenter has the same drives for $80, which would bring my total cost down significantly at 8 drives.
 
So is there a fix for using the WD 1TB Caviar Black drives in RAID?
I bought 3 recently and only one allows TLER to be turned on.

Is there a RAID controller on the market that can control the timeout function?
 
Quick note for the *smart* people using Linux software RAID (mdadm): you do not need to toy with TLER on your drives :)

"In fact, md does not timeout disks as many Hardware RAID controllers do.
So, from md's point of view, TLER is useless, i.e. it has no benefit."

Link: http://www.issociate.de/board/post/498851/mdadm_and_TLER_%28Time_Limited_Error_Recovery%29.html

As stated in the thread, the worst that will happen is that the array will lock up during the fix, but at least the drive will do its best to recover.
 
That is the problem, though:

I am running Windows 7 64-bit on my rig,

with onboard RAID control on an EVGA X58 mobo.
 
I think onboard is okay; it's still soft RAID and the drives are still responsible for any error recovery... no?

My problem is that my array is for a whitebox ESX build. I went with Hitachis, and the controller arrived today, so hopefully I won't have any big problems...
 
I think onboard is okay; it's still soft RAID and the drives are still responsible for any error recovery... no?
Yes, I would like to get this clarified also. I have a couple of WD Black drives on the way, to run RAID 0 or Matrix RAID, not sure yet.
Reading over the TLER wiki, it states:

"Intel Matrix Raid embedded in Intel server motherboards and modern desktop motherboards is a pseudo-hardware controller, not real hardware raid."

And that's all that is mentioned; nothing about whether TLER is still needed or not.

So, is it unnecessary to turn on TLER when using the onboard RAID controller(s)?
I'm running an MSI GD-80 motherboard.

TIA
 
Not sure it would matter in RAID 0 anyway; there's nothing to recover from.

But the problem is that the drives drop out of the array and require a reboot to be seen again. Then they are not in RAID 0 anymore and you have to set it back up.
 