Are 2TB hard drives reliable?

BER = Bit Error Rate; density is not the problem so much as the BER staying static while density has been increasing. There's a risk, with all drives, that the data you wrote to the disk isn't the same as the data that comes off the disk, and that's why they have ECC etc. to mitigate this. BER hasn't been improving in line with capacity, so on a large drive there is a greater risk that some of your data is bad. And your OS may not notice. I guess better error detection and correction is expensive to implement; you need faster, more complex processors on the drives and/or filesystem-level protection (like ZFS or the stuff that's going into Vail). But generally SAS enterprise drives have BERs that are an order of magnitude better than SATA desktop stuff. You pays your money...

If all you've got on your drive is video, this doesn't matter - media players can cope with corrupt files. If it's something more important than that, then it's much more serious. Sure, backups are important, but having to restore from backup costs time and money, and you shouldn't have to do this just because your drive has unfit-for-purpose error correction. And what happens if you back up a corrupted file?
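
To put rough numbers on the BER point, here is a minimal sketch. It assumes a spec-sheet rate of one unrecoverable error per 1e14 bits read for a desktop drive and 1e15 for an enterprise drive. Those two figures are assumptions for illustration, not numbers from this thread, so check the datasheet for a given model. It also assumes errors are independent of each other.

```python
import math

def expected_errors(capacity_bytes, bits_per_error):
    """Expected number of unrecoverable read errors when reading the whole drive once."""
    return capacity_bytes * 8 / bits_per_error

def prob_at_least_one(capacity_bytes, bits_per_error):
    """Chance of at least one error in a full read, treating errors as
    independent rare events (Poisson approximation)."""
    return 1.0 - math.exp(-expected_errors(capacity_bytes, bits_per_error))

CAPACITY_2TB = 2 * 10**12   # a 2TB drive, decimal terabytes
DESKTOP_SPEC = 10**14       # ASSUMED: 1 error per 1e14 bits read
ENTERPRISE_SPEC = 10**15    # ASSUMED: an order of magnitude better

for label, spec in (("desktop", DESKTOP_SPEC), ("enterprise", ENTERPRISE_SPEC)):
    print(f"{label}: expected errors per full read = "
          f"{expected_errors(CAPACITY_2TB, spec):.3f}, "
          f"P(at least one) = {prob_at_least_one(CAPACITY_2TB, spec):.1%}")
```

Under those assumptions, the expected error count for one full read scales directly with capacity, which is the whole argument: the spec stays flat while the number of bits read keeps growing.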
 
That's why you would want 4K sectors; to have better protection against BER.

As you stated, BER has remained roughly the same over the years; while capacities grew. That was one reason why they opted for 4K sectors; to allow more ECC room and thus better protection against BER.

This doesn't translate to lower (U)BER specs, though. The UBER specs on 512-byte sector and 4K sector drives are the same; but those specs are more like marketing: firmware-enhanced RAID edition drives would develop 10 times fewer bad sectors if you believe the specs, even though they are physically the same as non-RE drives.

Could you explain to me: "(like ZFS or the stuff that's going into Vail)" ? Is Vail getting a new filesystem with checksumming? I would find that highly unlikely; Microsoft stopped innovating storage a long time ago and WinFS is permanently cancelled afaik.
 
IIRC it's a 64-bit hash per sector, good for 1- and 2-bit errors. That's why there's quite a bit of overhead involved in storing stuff on Vail vs. WHS v1 (like 10%).
 
Thanks for the information, and advice, but my budget is really only around $100. Looks like that constellation is upwards of $300.

I will be storing HD movies, game ISO's and ROMS, and music mostly. All my important things, I keep backed up, and isn't anywhere as demanding in size, so I can store it on a flash drive or a few dvds if need be =p
 
Good to know that 4K improves resiliency.

Vail: in addition to mirroring (now blocks, not files) it uses up extra space (12% I think) to provide better error correction. Lots of people are complaining about this, but basically MS claim they put it in because they had lots of silent data corruption problems coming into support from WHS users who were using cheap desktop drives. WHS v1 just duplicates files, so instead of one corrupt file on one disk, you'd have two corrupt files on two disks. The downside (apart from losing more capacity than mirroring) is that you lose the ability to pull a drive from Vail and read it from anything that can do NTFS.
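
The difference between plain duplication and checksummed mirroring can be sketched like this. It is not Vail's actual on-disk scheme: the 64-bit BLAKE2 hash, the 512-byte block size, and all function names are assumptions for illustration. The sketch only detects a bad block and falls back to the mirror copy; it does not implement any error-correcting code.

```python
import hashlib

BLOCK_SIZE = 512  # hypothetical block size

def checksum(block: bytes) -> bytes:
    """64-bit hash of one block (BLAKE2b truncated to 8 bytes)."""
    return hashlib.blake2b(block, digest_size=8).digest()

def store(data: bytes):
    """Split data into blocks and keep a checksum next to each block."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    return [(b, checksum(b)) for b in blocks]

def read_mirrored(copy_a, copy_b) -> bytes:
    """Read block by block, falling back to the mirror when a checksum fails."""
    out = []
    for (block_a, sum_a), (block_b, sum_b) in zip(copy_a, copy_b):
        if checksum(block_a) == sum_a:
            out.append(block_a)
        elif checksum(block_b) == sum_b:
            out.append(block_b)
        else:
            raise IOError("both copies of a block are corrupt")
    return b"".join(out)

original = bytes(range(256)) * 8  # 2048 bytes of sample data
disk_a, disk_b = store(original), store(original)

# Simulate silent corruption: flip one bit in the second block on disk A.
bad = bytearray(disk_a[1][0])
bad[10] ^= 0x01
disk_a[1] = (bytes(bad), disk_a[1][1])

recovered = read_mirrored(disk_a, disk_b)
print("recovered intact:", recovered == original)
```

Without the stored checksums, a reader has no way to tell which of the two copies is the bad one, which is the WHS v1 problem described above.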
 
Very interesting! Is this some sort of layer that sits on NTFS? I would need to read about Vail; but it looks like they indeed addressed the corruption issues. Good news for those interested in WHS I'd say!
 
Steer clear of Seagate Barracuda LP drives. Mine are dying left and right. I have 4 and 2 have gone to RMA and they are less than a year old.

My experience with the LPs has been great. I've had 6 in a RAID5 for about a year now and I recently added two more to the array about two months ago. Not everybody has had a bad experience with the LPs as some people would have others think.
 
Does WD offer advanced RMA, where if a drive is faulty they'll ship you the new one first? Does Samsung?
 
Does WD offer advanced RMA, where if a drive is faulty they'll ship you the new one first?

Yes. I believe they take your credit card # so if you do not return the drive they charge you "retail" price for the advance RMA.
 
I've got a few more noob questions, I figure I may as well add them to my already existing thread instead of making a new one.

1. how come during the hard drive benchmarks, they constantly get slower? is it because it's reading from the outside of the disk to the inside or vice versa? will the discs "real world" transfer rate be more like the benchmark results minimum, or average?

2. I've read about 4k sectors being good. i don't really know what that means but I notice the F4 says it emulates them. Is that bad? do I need to do anything particular during my format to use 4k sectors or set up anything in windows?

3. I understand read/write speeds (who couldn't =p) but what about burst and access time?

thanks for all the help so far you've been very supportive!
 
1. how come during the hard drive benchmarks, they constantly get slower? is it because it's reading from the outside of the disk to the inside or vice versa? will the discs "real world" transfer rate be more like the benchmark results minimum, or average?

Reading from the outer tracks is the fastest. Each successive track is a little slower as you go from the outer tracks to the inner tracks. In the real world you try to put the stuff you want fastest on the outer tracks. The OS goes first. Large files such as backups can be put at the end of your data, which leaves the outer tracks for files that need to be fast. Defragging with a program like MyDefrag can help by optimizing which files go to which locations on the disk.

http://www.mydefrag.com/
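
The downward slope of a benchmark graph can be approximated with a simple model. This is an idealized sketch, assuming constant RPM, the same bit density along every track, and evenly spaced tracks; real drives use zones and will only roughly follow it. The `inner_ratio` of 0.5 (innermost track at half the radius of the outermost) is a hypothetical value, not a measurement of any drive in this thread.

```python
import math

def relative_speed(frac: float, inner_ratio: float = 0.5) -> float:
    """Throughput at position `frac` of the drive's capacity
    (0 = outermost track, 1 = innermost), relative to the outermost track.

    At constant RPM, throughput is proportional to track radius. The data
    stored outside radius r grows with (R_outer^2 - r^2), so r^2 falls
    linearly as `frac` increases.
    """
    if not 0.0 <= frac <= 1.0:
        raise ValueError("frac must be between 0 and 1")
    r_squared = 1.0 - frac * (1.0 - inner_ratio ** 2)
    return math.sqrt(r_squared)

# Sample the curve the way a benchmark sweeps the disk from start to end.
samples = [relative_speed(i / 1000) for i in range(1001)]
average = sum(samples) / len(samples)

print(f"start of disk: {samples[0]:.2f}x, end of disk: {samples[-1]:.2f}x, "
      f"average over whole disk: {average:.2f}x")
```

In this model the average sits above the midpoint between the fastest and slowest readings, because most of the capacity lives on the longer outer tracks. So a mostly empty drive should behave closer to the benchmark maximum than to the minimum.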

2. I've read about 4k sectors being good. i don't really know what that means but I notice the F4 says it emulates them. Is that bad? do I need to do anything particular during my format to use 4k sectors or set up anything in windows?

On the media the sectors are actually 4K, but the firmware presents all sectors as 512 bytes in order for the drives to work with XP and older operating systems that will not understand drives with sector sizes other than 512 bytes. This can be a negative for systems that already support larger sector sizes. One such negative is in Linux. The OS and tools support 4K sectors, but the main problem is that the drive tells the OS and applications that the sectors are 512 bytes, so fdisk (the partition creation tool) will misalign the partitions by default. The reason why Vista / Windows 7 do not also misalign is that these operating systems start the first partition at sector 2048 by default, which is divisible by 8. All the fuss about misalignment comes from XP and older Windows operating systems, which started the first partition at sector 63 (not divisible by 8). This causes performance problems because a lot of small reads and writes will then require two 4K sectors to be updated instead of the one that properly aligned data would need.

With all this said, all 4K drives today use this same emulation.
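
The sector 63 versus sector 2048 arithmetic can be checked with a few lines. A minimal sketch, assuming 512-byte emulated sectors on 4096-byte physical sectors as described above; the function names are made up for illustration.

```python
LOGICAL_SECTOR = 512     # what the drive reports to the OS
PHYSICAL_SECTOR = 4096   # what is actually on the platters

def partition_is_aligned(start_lba: int) -> bool:
    """True if the partition's first byte falls on a physical sector boundary."""
    return (start_lba * LOGICAL_SECTOR) % PHYSICAL_SECTOR == 0

def physical_sectors_touched(start_lba: int, offset_bytes: int, length_bytes: int) -> int:
    """Number of physical sectors covered by a write of `length_bytes`
    starting `offset_bytes` into a partition that begins at `start_lba`."""
    first = start_lba * LOGICAL_SECTOR + offset_bytes
    last = first + length_bytes - 1
    return last // PHYSICAL_SECTOR - first // PHYSICAL_SECTOR + 1

for start in (63, 2048):
    touched = physical_sectors_touched(start, offset_bytes=0, length_bytes=4096)
    print(f"partition at sector {start}: aligned={partition_is_aligned(start)}, "
          f"a 4K write touches {touched} physical sector(s)")
```

A write that straddles two physical sectors forces the drive to read, modify, and rewrite both, which is where the slowdown on misaligned partitions comes from.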

3. I understand read/write speeds (who couldn't =p) but what about burst and access time?
Burst is the speed the drive reads from the ram cache that is on the drive itself.
Access time is a measure of random seek time.
 
Thanks for the information =)

How can you choose where your information is stored though, be it the outer track or inner track? Is doing a regular defrag just about your best bet at "putting things in place"? I bet SSDs then have a more constant read/write rate (on top of being faster) because of their lack of "tracks" and "disks".

Would making 3 partitions of 667GB speed up the disk at all (being platter size) or make no difference?

thanks!
 
I finally got around to ordering the drive. $89.99 shipped from newegg, WOW.

When I get the drive, what are some of the first things I should do?

I'm guessing format is #1 (is windows good enough for that or should I use a program? and is there any downside to choosing "quick" format?)

Once its formatted I should open up HDtune, check SMART, and run an error scan?
 
Some will say a long format is unnecessary but I like to do it as it's more thorough and you're more likely to discover any problems that the drive may have.

I did the long format on all 8 of my Samsung 2TB drives. It took a LONG time (3-4 hours per drive) but it's not like I needed to use them immediately. I then run CrystalDiskInfo and check the health, reallocated sector count, etc. just to make sure everything looks good.

I used Windows 7's disk management console to format mine. Does the job just fine.
 
I did the long format on all 8 of my Samsung 2TB drives. It took a LONG time (3-4 hours per drive) but it's not like I needed to use them immediately. I then run CrystalDiskInfo and check the health, reallocated sector count, etc. just to make sure everything looks good.

I recommend this. It takes a lot longer, but then you know the drive can read and write all sectors without problems. This is also way better than manufacturer long tests that do not check every single sector, and tests that only read each sector.
 
Might be good to stress test the HDDs. A Google study suggests some models fail prematurely if loaded heavily, which in turn causes failure rates to drop after 6 months. This makes me believe you can 'weed out duds', as it were: disks that have a built-in weakness and are just waiting to fail. Stressing those disks for a few weeks may be enough to kill them.

The only difference would be that they fail during the stress test period, instead of after you have taken these disks into use.
 
Use MBR if you plan on using the drive on Windows XP or some other old OS. Use GPT if you will only use the drive on Vista+ or a current version of Linux.
 
OK I don't plan on ever going back to xp, but i do have a patriot box office and netbook with windows xp that often will access the drive through my network.

will gpt cause any issues with this, due to its limited compatibility? I also am running windows 7 32 bit, if that matters.
 
It can only cause an issue if you want to plug the drive directly into the device either by a SATA port or a usb external enclosure. It has no effect on sharing files from the drive over the network.
 
ok great, thanks for the super fast reply. Just out of curiosity, what are the advantages of GPT over MBR? Aside from larger partitions.

thanks a lot!

Gah, more questions. I want to make sure I do this right =)

http://img229.imageshack.us/img229/8909/harddrive2.png

Should I select default allocation unit size? 512? or 4096? (i believe this has to do with the 4k sectors). Drive will be mostly used for storage of files like game ISO's, HD video, mp3s, pictures, maybe documents

I think im right in unchecking quick format and file and folder compression, please correct me if im wrong.

thanks a ton
 
ok great, thanks for the super fast reply. Just out of curiosity, what are the advantages of GPT over MBR? Aside from larger partitions.

Larger partitions. No need for extended partitions. And since a UUID is stored in the GPT for each partition, Windows can identify the partition more quickly.
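
For the "larger partitions" point, the MBR ceiling is easy to work out. A minimal sketch, assuming 512-byte logical sectors and relying on the fact that an MBR partition entry stores its length as a 32-bit sector count; the drive size is the "User Capacity" figure that smartctl reports for the HD204UI in this thread.

```python
SECTOR_SIZE = 512                     # logical sector size reported by the drive
MAX_SECTORS_PER_PARTITION = 2**32 - 1 # largest value a 32-bit length field can hold

mbr_limit_bytes = MAX_SECTORS_PER_PARTITION * SECTOR_SIZE
drive_bytes = 2_000_398_934_016       # "User Capacity" of the 2TB Samsung HD204UI

print(f"MBR partition limit: {mbr_limit_bytes:,} bytes "
      f"({mbr_limit_bytes / 2**40:.2f} TiB)")
print(f"2TB drive:           {drive_bytes:,} bytes "
      f"({drive_bytes / 2**40:.2f} TiB)")
print("fits in a single MBR partition:", drive_bytes <= mbr_limit_bytes)
```

So a 2TB drive still fits under MBR with a little room to spare; the size advantage of GPT only starts to matter for drives larger than this.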

Should I select default allocation unit size? 512? or 4096?
The default is 4096. Go with that.

I think im right in unchecking quick format and file and folder compression, please correct me if im wrong.

That is good. The full format will take 3 to 4 hours. Check the SMART after the format with a program like CrystalDiskInfo.
 
So I've been running a lot of tests and I'm not sure what to think. Smart seems normal, except a value of 1 on the g-sense error rate and 335 program fail count.

Read speeds look about right for the drive, but the graph that HDTune produces is very wild, with sharp spikes of varying intensity.

http://img51.imageshack.us/i/samsung2tbspeeds.png/
http://img26.imageshack.us/i/smartdata.png/

During the sharp drops I hear my system drive "thinking". I closed all programs though and disabled the antivirus. I know it's my main drive making the sound because it's the loud crunching sound my WD black always makes and I'm very familiar with it.

I did some more tests. It looks like the irregularities are getting worse. My other drives don't jump up and down like this one... arggg, this is pissing me off

http://img51.imageshack.us/img51/8924/hddsamsung2tbwtf.png

I added about 50gb to the drive, it was empty before. I did this to see if it made any change in the graph, and it did. The spikes are more extreme, but at least it shows consistency. I also noticed that on all the benchmarks with my new drive, cpu usage is about 5-6% higher than on any of my other ones. weird.

http://img526.imageshack.us/img526/6522/50gb.png

What do you think, does this warrant an RMA? I honestly have no idea.
 
I am not sure about the spikes. But CrystalDiskInfo looks good.

Do you have indexing on the drive? Is your antivirus program running?
 
I turned everything off including antivirus, and I'm pretty sure indexing is on all my drives. none of my other drives give me the spike problem though, which is why im concerned.
 
Is the smart data still the same? I mean no reallocated sectors? No pending or off line uncorrectable sectors?
 
Just tested it again, it's back to the spikes.

So what do you think? Newegg just sold out, so I may have to wait if I want to RMA. Think I should? Or just keep it and check smart every now and then?
 
I'm not sure how to do that. Someone over at toms hardware asked me to disable smart and try it, but don't know how to do that either =p

program fail count is up to 2145 now. from 385 when I first booted up, not sure if its significant
 
I have a nonzero program fail count on my drives:

Code:
jmd0 ~ # smartctl --all /dev/sde
smartctl 5.40 2010-10-16 r3189 [x86_64-pc-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     SAMSUNG HD204UI
Serial Number:    S2HGJDWZ806049
Firmware Version: 1AQ10001
User Capacity:    2,000,398,934,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 6
Local Time is:    Tue Nov  2 10:56:42 2010 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (21300) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
SCT capabilities:              (0x003f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0026   252   252   000    Old_age   Always       -       0
  3 Spin_Up_Time            0x0023   077   067   025    Pre-fail  Always       -       7164
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       18
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   252   252   051    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0024   252   252   015    Old_age   Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       1100
 10 Spin_Retry_Count        0x0032   252   252   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   252   252   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       26
181 Program_Fail_Cnt_Total  0x0022   100   100   000    Old_age   Always       -       2322
191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       66
192 Power-Off_Retract_Count 0x0022   252   252   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0002   064   059   000    Old_age   Always       -       30 (Min/Max 21/42)
195 Hardware_ECC_Recovered  0x003a   100   100   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   252   252   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0036   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       3
223 Load_Retry_Count        0x0032   252   252   000    Old_age   Always       -       0
225 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       13

I am not sure what that means however. Although my drive has been on for 1100 hours.

Hmm. This is 0 on a drive that has been on 688 hours.

Code:
jmd1 ~ # smartctl --all /dev/sda
smartctl 5.40 2010-10-16 r3189 [x86_64-pc-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     SAMSUNG HD204UI
Serial Number:    S2HGJ1BZ836643
Firmware Version: 1AQ10001
User Capacity:    2,000,398,934,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 6
Local Time is:    Tue Nov  2 11:00:45 2010 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (21060) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
SCT capabilities:              (0x003f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       24
  2 Throughput_Performance  0x0026   056   056   000    Old_age   Always       -       19168
  3 Spin_Up_Time            0x0023   078   068   025    Pre-fail  Always       -       6823
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       16
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   252   252   051    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0024   252   252   015    Old_age   Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       688
 10 Spin_Retry_Count        0x0032   252   252   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   252   252   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       13
181 Program_Fail_Cnt_Total  0x0022   252   252   000    Old_age   Always       -       0
191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       144
192 Power-Off_Retract_Count 0x0022   252   252   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0002   064   055   000    Old_age   Always       -       34 (Min/Max 22/45)
195 Hardware_ECC_Recovered  0x003a   100   100   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   252   252   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0036   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       3
223 Load_Retry_Count        0x0032   252   252   000    Old_age   Always       -       0
225 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       16

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%       138         -

Note: selective self-test log revision number (0) not 1 implies that no selective self-test has ever been run
SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Completed [00% left] (0-65535)
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
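
Rather than eyeballing these dumps, the attribute table can be parsed and the sectors asked about earlier in the thread (reallocated, pending, offline uncorrectable) checked automatically. A minimal sketch: the sample lines are copied from the first smartctl output above, the watch list of IDs 5, 197 and 198 is my own choice, and the parser assumes the ten-column layout shown here with a raw value that starts with an integer. Other drives or smartctl versions may format things differently.

```python
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0022   100   100   000    Old_age   Always       -       2322
194 Temperature_Celsius     0x0002   064   059   000    Old_age   Always       -       30 (Min/Max 21/42)
197 Current_Pending_Sector  0x0032   252   252   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
"""

WATCH = {5, 197, 198}  # reallocated, pending, offline-uncorrectable sectors

def parse_attributes(text: str) -> dict:
    """Parse smartctl attribute rows into {id: {name, value, thresh, raw}}."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split(None, 9)  # 10 columns; the last keeps any trailing text
        if len(parts) < 10 or not parts[0].isdigit():
            continue                 # skip the header and anything that is not a row
        attrs[int(parts[0])] = {
            "name": parts[1],
            "value": int(parts[3]),
            "thresh": int(parts[5]),
            "raw": int(parts[9].split()[0]),  # "30 (Min/Max 21/42)" -> 30
        }
    return attrs

attrs = parse_attributes(SAMPLE)
problems = {i: a for i, a in attrs.items() if i in WATCH and a["raw"] > 0}

for attr_id in sorted(attrs):
    a = attrs[attr_id]
    print(f"{attr_id:>3} {a['name']:<24} raw={a['raw']}")
print("bad-sector indicators all zero:", not problems)
```

Saving the raw values with a timestamp on each run would also make it easy to see how fast something like the program fail count is climbing.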
 
http://kb.acronis.com/content/9183

It looks like normal aging, or possibly a hardware problem... mine seems to be rising rapidly, nearly exceeding yours, and that's at 11 hours on mine vs. 1100 on yours.

I wonder if that has something to do with it. I should probably just RMA the thing
 
http://img210.imageshack.us/img210/3378/cableswap.png

OK I swapped cables with another drive of mine that gave great graphs, booted up in safe mode, and ran it 5 times in a row. Again, it starts out great, and gets more erratic around the 3rd run, though it does seem less severe using different cables.

So can someone who really knows their stuff give me an RMA recommendation? Safe mode results are good, but the more I run it, the worse they get. They still aren't that bad though, at least compared to my graphs in regular Win7, which are terrible. And then there's program fail count, which is at 2167 and rising, but I'm not sure how significant that is. If it matters, I formatted the drive as GPT rather than MBR.

thanks!
 
I just went ahead and ordered another drive. I'm going to RMA this one, program fail count skyrocketed to 20000 last night.

I'm going to have 2 F4's now so I'm probably going to replace my WD FAALS1001 with an F4 as my boot drive. I've read that partitioning drives is bad for them and slows them down, any truth to that? I've always had my FAALS partitioned as 230gb for system and the rest for data.
 
I do that at work on all new installs, because it keeps the changing data from fragmenting the operating system. We are a research team in which each member creates and deletes thousands of files daily. I certainly do that myself as a programmer when I build my 100 to 300 thousand line C++ programs. It also makes it easier to back up, since I can ignore the operating system and just back up the data.

Partitioning can slow things down if the first partitions are mostly empty because the disk is faster on the outer tracks. It can also slow things down if you routinely use stuff from the first and last partitions at the same time such that the heads have a long way to go.
 