What do you do to test new hard drives?

Megalith

I like the idea of a SMART short self-test and a full sequential surface (read) test. This took about 10 hours for my new 5TB drives, and I am thinking this is thorough enough to root out any potential issues.

Do you think this is enough, or should new drives be punished further with a full write test?
 
I like to write random data to the drive and never return to that sector again. It's the only way to be sure.
 
I do a full 4-pass badblocks on every single hard drive I get at home, at work, or for customers (no difference between new and RMA drives). For 5TB drives this takes 60+ hours.
 
I use HD Sentinel and do 2 full passes of its read/write test, which takes about 30-40 hours on a 3TB drive. I don't know if more is really required, but 2 seems decent to me. So far 13 drives have passed without issue.

BTW, I love HD Sentinel and I am a major fan. I bought 5 copies for my various computers. I love the logs in it. It's also super Barney-style, so even my dad can use it and understand it. It could use a little more hand-holding: the FAQ could be better and pop-up descriptions could be more prevalent. But it's solid and really impressive to me.
 
Since the drives have been meticulously tested at the factory, I plug them in and start using them.
 
Before I do anything with the drive, I first note the raw value of Reallocated Sector Count in SMART. I then do a full format and check the RSC again. If it hasn't increased, I just start using it normally. If RSC has increased, I'll decide then what to do, depending on how much it increased.
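A minimal sketch of that before/after check, assuming smartmontools is installed and that `smartctl -A` prints the usual attribute table (ID in column 1, raw value in column 10). The function names and `/dev/sdX` are placeholders of mine, not anything from the post.

```shell
# Read the raw Reallocated Sector Count (SMART attribute ID 5)
# from `smartctl -A` output supplied on stdin.
# Exits non-zero if the attribute is not present in the table.
rsc_raw() {
    awk '$1 == 5 { print $10; found = 1 } END { exit !found }'
}

# Succeeds (exit 0) only if the "after" value is larger than "before".
rsc_increased() {
    [ "$2" -gt "$1" ]
}

# Typical use (commented out; /dev/sdX is a placeholder):
#   before=$(sudo smartctl -A /dev/sdX | rsc_raw)
#   ... do the full format ...
#   after=$(sudo smartctl -A /dev/sdX | rsc_raw)
#   rsc_increased "$before" "$after" && echo "RSC grew: $before -> $after"
```

What counts as "too much" growth is still a judgment call; the sketch only tells you whether the number moved.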
 
I assume he is trolling...
Before I do anything with the drive, I first note the raw value of Reallocated Sector Count in SMART. I then do a full format and check the RSC again. If it hasn't increased, I just start using it normally. If RSC has increased, I'll decide then what to do, depending on how much it increased.

If you're lazy, HD Sentinel keeps the records for you :D It's nice when you're lazy and you have 20 or more HDDs floating around.
 
StableBit Scanner (which I got for free via their Twitter giveaways). I use it to scan the entire drive surface.
 

so.. you really think that they just slap that stuff together and ship it? lol I worked in PC production for a long time and I can assure you every piece is tested before it leaves...

plug it in and turn it on, if you make it the first 4 hours, chances are you will get plenty of use out of it...
 
so.. you really think that they just slap that stuff together and ship it? lol I worked in PC production for a long time and I can assure you every piece is tested before it leaves...

plug it in and turn it on, if you make it the first 4 hours, chances are you will get plenty of use out of it...

You win the ignorance of the year award. :rolleyes:

Because we all know shipping has 0 effect on hard drives.

You can "chance" it yourself if that's what you want to do but your advice is ridiculous.

PS: I had not said "what I think" about how they "slap" anything... so you can stop trying to put words in my mouth.
 
Yeah, because manufacturers like WD never over-engineer their packaging for retail drives, nor do they design their drives to take some abuse during shipping by parking the heads off the platter. Talk about being silly...

Not a single drive I have ever purchased has failed due to shipping, and I have received both well-packaged retail drives and bulk-packaged drives that were not properly secured in their shipping boxes... (thank you, Newegg)
 
At work I have had several dozen DOA drives however I have purchased several hundred over my ~19 years on the job and I have also done 75 or so RMAs.
 
At work I have had several dozen DOA drives however I have purchased several hundred over my ~19 years on the job and I have also done 75 or so RMAs.


Agreed. Out of all the computer parts I've worked with in my career, HDDs are the ones I've had to RMA the most. It makes sense to be honest as they are one of the few moving parts in a PC.
 
Agreed. Out of all the computer parts I've worked with in my career, HDDs are the ones I've had to RMA the most. It makes sense to be honest as they are one of the few moving parts in a PC.

I would say PSUs are second :/

but yea...where is yeu coming from...I seriously thought he was trolling and being sarcastic.
 
I run badblocks and compare the pre and post results. I don't think you are supposed to do that with SSDs, though. Not sure what a good test for them is.
 
Nothing. I just use them. If they fail within a few hours, big deal. Most likely all I lose is a Windows install and some program installs. If a drive lasts longer than that, it will most likely last a good while.
 
Does anyone know of a ZFS/NAS solution like FreeNAS, ZFSguru, Rockstor, Napp-it, etc that has hard drive test/break in features included in it?
 
Does anyone know of a ZFS/NAS solution like FreeNAS, ZFSguru, Rockstor, Napp-it, etc that has hard drive test/break in features included in it?

I don't think any do. Some of us just use separate programs for it (mentioned above).
 
Here's what I do in Linux [or Windows]. Any switches are for smartctl unless otherwise noted. On Windows, instead of the CLI, you can use the freeware GUI hddguardian. The drive used for examples is clean with no bad sectors, but extended statistics/errors reveal "non-fatal" errors.

  1. If not a sealed retail drive, ensure SMART values are all good to begin with (-A) EXAMPLE
  2. Run SMART short test followed by SMART conveyance test (if supported) -- few minutes total. (-t short, -t conveyance)
  3. Some drives will only update some SMART attributes if offline data collection is enabled, so turn it on (-o on) and run an offline test (a minute or two, -t offline).
  4. Verify drive passed tests (-l xselftest,selftest). Note SMART values for future reference.
  5. Write all zeros to drive with dd (use a large blocksize) [or go to Disk Management, initialize drive, create a full-size partition and format BUT uncheck the quick box]
  6. During this run, verify the drive temperature stays OK. Most new drives will also provide temperature statistics and/or a log of recent temperatures (-l scttemp), although the logging period can vary from one minute to one hour. EXAMPLE
  7. After the run, the drive will have reallocated any initially bad sectors. Compare SMART values to initial reference, checking for (pending) bad sectors. Also check the drive's error log (-l xerror,error) to see if the drive is reporting any non-fatal errors NOT resulting in a bad sector. EXAMPLE
  8. Run a full badblocks read/write test (badblocks -wsv -b 65536), which involves 4 full read and 4 full write passes and can take tens of hours for large 4TB+ drives. If you don't have that much time, use nwipe (clone of DBAN for running systems) to do one full random write and verify (nwipe -m random --verify=all -r 1 --noblank). [On Windows, the freeware tool h2testw will do one full random write and verify pass -- make a folder on the test drive and give that to h2testw so it doesn't ask for admin rights. Alternatively, use a paid tool like HDSentinel to do four write+read(verify) passes.]
  9. Note that the badblocks patterns of 0xaa (10101010), 0x55 (01010101), 0xff (11111111) and 0x00 (00000000) are from the days when "weak" sectors on hard drives could flip bits. Modern drives don't do this -- it's the number of passes that matters, not the data.
  10. Check SMART values and the error log. Some drives also support extended device statistics (-l devstat) which can show you interesting statistics not in the standard SMART dataset, like average temperature and some non-fatal errors. EXAMPLE
  11. At this point, if all is well, the drive surface has proved to be good via multiple sequential read/writes. However, I also like to stress test the drive mechanism (actuator/heads) via random reads/writes. In Linux, use fio to perform multiple concurrent small reads/writes at random locations across the entire disk surface for at least two hours. [On Windows, use the free Microsoft utility diskspd to do the same thing.] This test will max out the drive temps, so make sure it's properly cooled. EXAMPLE of another drive with extremely low hours that passed all the previous steps but started failing after fio -- note the Seek Error Rate; first 4 hex are errors, next 10 hex are total seeks.
  12. Check SMART values, keeping an eye out for any seek error statistics, and the device error log.
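For anyone who wants the Linux steps above in one place, here is a rough sketch that only prints the commands in order rather than running them. The smartctl, badblocks, and nwipe switches are the ones quoted in the list. The dd and fio options (`bs=1M`, `status=progress`, `--rw=randrw`, `--bs=4k`, `--iodepth=16`, `--numjobs=4`, a 7200-second runtime) are my own assumptions, since the post gives no exact values, and `burn_in_plan` and `/dev/sdX` are placeholder names.

```shell
# Print (do NOT run) the command sequence for one drive.
# WARNING: once executed, dd, badblocks -w and fio all overwrite the disk.
# Each SMART self-test must be allowed to finish before starting the next.
burn_in_plan() {
    dev=$1
    cat <<EOF
# steps 1-4: baseline attributes, self-tests, results
smartctl -A $dev
smartctl -t short $dev
smartctl -t conveyance $dev
smartctl -o on $dev
smartctl -t offline $dev
smartctl -l xselftest,selftest $dev
# steps 5-7: zero fill, temperature log, compare attributes and error log
dd if=/dev/zero of=$dev bs=1M status=progress
smartctl -l scttemp $dev
smartctl -A $dev
smartctl -l xerror,error $dev
# steps 8-10: 4-pass write/read test, then attributes and device statistics
badblocks -wsv -b 65536 $dev
smartctl -A $dev
smartctl -l devstat $dev
# steps 11-12: two hours of random small I/O, then a final check
fio --name=seekstress --filename=$dev --rw=randrw --bs=4k --direct=1 --ioengine=libaio --iodepth=16 --numjobs=4 --group_reporting --time_based --runtime=7200
smartctl -A $dev
smartctl -l xerror,error $dev
EOF
}
```

Running `burn_in_plan /dev/sdX` just lists the commands, so you can review them or paste them one at a time. Triple-check the device name before running anything that writes.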
 
I usually put the drive in the slot it will be used in, but before adding it to the RAID array I do a dd write and then a dd read of the whole drive. I check dmesg for any weird kernel errors. If I start getting I/O-type errors, the drive is RMAed. I also check SMART to see if any of the error counts change, though I've been told that with some drives nonzero numbers are normal. It still makes me uneasy if I see anything but 0 in the error rows.

Once I feel the drive is safe to use, I add it to the RAID array and let it rebuild. If I'm building a new RAID array, I do this process with all the drives at the same time and build the array afterward.

I suppose running the SMART tests afterward would be a good idea too; I never think about that.
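Roughly, that dd-then-dmesg routine could look like the sketch below. The block size and `status=progress` are assumptions on my part, the function names are made up, and the grep pattern assumes the kernel logs block errors in the common "I/O error, dev sdX, sector N" form.

```shell
# Full-surface write pass. DESTRUCTIVE: overwrites the whole device.
# dd stops with "No space left on device" at the end of the disk; that is expected.
write_pass() {
    dd if=/dev/zero of="$1" bs=1M status=progress
}

# Full-surface read pass; the data is discarded.
read_pass() {
    dd if="$1" of=/dev/null bs=1M status=progress
}

# Filter kernel log text on stdin for I/O errors on one disk.
# $1 is the kernel name without /dev/, e.g. sdb.
io_errors_for() {
    grep -E "I/O error, dev $1[, ]"
}

# Typical use (commented out; sdX is a placeholder):
#   write_pass /dev/sdX
#   read_pass /dev/sdX
#   dmesg | io_errors_for sdX
```

No output from the last command means the kernel did not log I/O errors for that disk during the passes.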
 
Maybe I've just been lucky. Hundreds of drives, never had a DOA or a failure in the first few months.

Perhaps. Certainly makes me question the lengthy gyrations that some people apparently put themselves through.

IME, the reason to extensively "burn-in" new drives is more to weed out the "weaker" specimens that might fail after more than a few months (or even 1-2 years) of normal use.

Of course, this is for consumer drives usually bought bare from e-tailers one or two at a time. If I could afford crates (10 or 20 packs) of enterprise drives, I would be happy with a much shorter series of tests run on the array instead of individual drives.
 
You think you can detect drives that may fail after one or two years of use?

LOL!

For drives that will be lightly-loaded and accessed mostly sequentially in normal use, often, yes.

This drive passed badblocks and only began showing seek errors after at least 1 million seeks during the fio test. Note the low power-on hours, single bad sector and general lack of other issues. I suspect in "normal" use the drive would have done just fine until it either went past a few million seeks and/or experienced a prolonged relative rise in operating temperature (the two things fio does in a very short period of time)

RDhGEyS.png
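To read the Seek Error Rate the way it was described earlier in the thread (first 4 hex digits are errors, the remaining digits are total seeks), something like this sketch works. The function name is a placeholder, and the raw value used in the note below is invented for illustration, not taken from the screenshot. Not every vendor packs this attribute the same way.

```shell
# Split a hex Seek Error Rate raw value into errors and total seeks,
# assuming the layout described in this thread: first 4 hex digits are
# the error count, everything after that is the seek count.
decode_seek_error_rate() {
    raw=${1#0x}                                   # drop an optional 0x prefix
    errors_hex=$(printf '%s' "$raw" | cut -c1-4)  # leading 4 hex digits
    seeks_hex=$(printf '%s' "$raw" | cut -c5-)    # the rest
    printf 'errors=%d seeks=%d\n' "0x$errors_hex" "0x$seeks_hex"
}
```

For a made-up raw value, `decode_seek_error_rate 000200000F4240` prints `errors=2 seeks=1000000`.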
 
Perhaps. Certainly makes me question the lengthy gyrations that some people apparently put themselves through.

I started my testing procedure when we used to purchase drives at Newegg for work. When we did that, every single order (usually 3 to 10 drives but sometimes more) came back with at least 1 DOA drive. This was 7 or so years ago. Now that we can only purchase at CDW there have been far fewer DOA drives, though there have still been a handful. Also, we had a run of DOA RMAs from Seagate when they shipped out of McAllen, TX. No bad ones yet from Compton, CA, although nearly all of the Compton RMAs were enterprise drives, since we avoid desktop Seagate drives and the 7200.Xs are all out of warranty.
 
I started my testing procedure when we used to purchase drives at Newegg for work. When we did that, every single order (usually 3 to 10 drives but sometimes more) came back with at least 1 DOA drive. This was 7 or so years ago. Now that we can only purchase at CDW there have been far fewer DOA drives, though there have still been a handful. Also, we had a run of DOA RMAs from Seagate when they shipped out of McAllen, TX. No bad ones yet from Compton, CA, although nearly all of the Compton RMAs were enterprise drives, since we avoid desktop Seagate drives and the 7200.Xs are all out of warranty.

Let me guess, they were not retail drives? It's Newegg; their bare drives are trash because of shipping.
 
The Newegg ones were not shipped very well and not in retail boxes. The CDW orders and Seagate RMAs shipped well, also not in retail boxes but in packaging of a very similar design.
 
The Newegg ones were not shipped very well and not in retail boxes. The CDW orders and Seagate RMAs shipped well, also not in retail boxes but in packaging of a very similar design.

I am surprised that companies let Newegg ship with such shitty packaging, considering it costs those companies so much money.
 
Before I do anything with the drive, I first note the raw value of Reallocated Sector Count in SMART. I then do a full format and check the RSC again. If it hasn't increased, I just start using it normally. If RSC has increased, I'll decide then what to do, depending on how much it increased.
That's a nice but short version. I hope you're also noting current pending sectors, offline uncorrectable and UDMA CRC. If so, it's good enough!
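If you want all four of those in one snapshot, a sketch along these lines might do. It assumes the standard `smartctl -A` table and the usual IDs: 5 (reallocated sectors), 197 (current pending), 198 (offline uncorrectable), 199 (UDMA CRC errors). Some drives number or name these differently. `smart_watchlist` and the file names are placeholders.

```shell
# From `smartctl -A` output on stdin, keep only the attributes worth
# comparing before and after a test run: ID, name, and raw value.
smart_watchlist() {
    awk '$1 == 5 || $1 == 197 || $1 == 198 || $1 == 199 { print $1, $2, $10 }'
}

# Typical use (commented out; /dev/sdX is a placeholder):
#   sudo smartctl -A /dev/sdX | smart_watchlist > before.txt
#   ... format or test the drive ...
#   sudo smartctl -A /dev/sdX | smart_watchlist > after.txt
#   diff before.txt after.txt
```

An empty diff means none of the four raw values changed.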

At work I have had several dozen DOA drives however I have purchased several hundred over my ~19 years on the job and I have also done 75 or so RMAs.
This is what I adhere to and came to refer to it as the drescherjm protocol :D
However I sprinkle in a few full power cycles, as sometimes these cause SMART values to change.
 
I am surprised that companies let Newegg ship with such shitty packaging, considering it costs those companies so much money.

Agreed. In the end I expect the manufacturers paid for Newegg's poor shipping.
 