Questions about RAID "errors"

dkwolf

n00b
Joined
Aug 3, 2011
Messages
3
Hey there [H]ocp ppl :)

Just upgraded my server from onboard RAID with 2x 2TB Samsung F4EG drives to a used Adaptec 3805 controller with 6x 2TB Samsung F4EG drives. But when I check the controller logs I see a lot of "aborted commands". The system is running fine, but should I be worried about it?
I don't really want to cough up the $100 I would have to pay for support at Adaptec (the controller is about 3 years old now, so it is outside their free support).

Four of my F4EG drives are brand new (made 06/2011) and the other two are from 10/2010. The old drives have had their firmware upgraded. I have also tried to upgrade the new ones, just to be sure.

<snip from logs>
State...........................Optimal
S.M.A.R.T. error................No
Write-cache mode................Write back
Hardware errors.................0
Medium errors...................0
Parity errors...................0
Link failures...................0
Aborted commands................1049
S.M.A.R.T. warnings.............0
NCQ status......................Enabled

I get no hardware errors or other errors, just aborted commands. The ~1000 shown there built up over about 2 days of running.

So if anyone out there has some advice, I would appreciate it.
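
For anyone who wants to keep an eye on these counters over time, here is a minimal Python sketch that parses the dotted "name....value" lines from the snippet above and picks out the counters that are not zero. The sample text is copied from the post; the function names `parse_status` and `nonzero_counters` are made up for illustration, and the 48 hours is only the rough "2 days" of uptime mentioned above, so treat the per-hour figure as an estimate.

```python
import re

# Sample lines copied from the log snippet in the post above.
SAMPLE_LOG = """\
State...........................Optimal
S.M.A.R.T. error................No
Write-cache mode................Write back
Hardware errors.................0
Medium errors...................0
Parity errors...................0
Link failures...................0
Aborted commands................1049
S.M.A.R.T. warnings.............0
NCQ status......................Enabled
"""

# A field name, a run of at least three dots, then the value.
LINE_RE = re.compile(r"^(?P<key>.+?)\.{3,}(?P<value>.+)$")


def parse_status(text):
    """Return a dict mapping field names to their raw string values."""
    fields = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            fields[match.group("key").strip()] = match.group("value").strip()
    return fields


def nonzero_counters(fields):
    """Return only the numeric fields whose value is greater than zero."""
    return {k: int(v) for k, v in fields.items() if v.isdigit() and int(v) > 0}


if __name__ == "__main__":
    status = parse_status(SAMPLE_LOG)
    for name, count in nonzero_counters(status).items():
        # 48 hours is an assumption based on "about 2 days" of uptime.
        print(f"{name}: {count} total, roughly {count / 48:.0f} per hour")
```

Saving the output once a day would show whether the count keeps climbing at the same rate or levels off after a settings change.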
 
I don't want to hijack the thread, but I want to give it a bump.

I have the same problem,

except with a 3805 RAID card and Seagate 2TB ST2000DL003 drives.
Are these aborted commands something to be concerned about?
 
It most likely has to do with the firmware on your drives and on your card. Make sure you upgrade both. Here is an Adaptec page with some info.

http://ask.adaptec.com/scripts/adaptec_itic.cfg/php.exe/enduser/std_adp.php?p_faqid=15355

There are some drives which kick up these warnings. Unfortunately, I don't believe your drive is listed as supported. I have seen these problems on Samsung drives before, and they are more insidious than they appear. While you may not have any physical errors, you may have tons of logical errors. This bit rot shows up more often than you would expect, though it may not be very obvious initially. A few flipped bits won't bother you much (depending on their position) if you are storing pics, movies and mp3s, but could be disastrous if you have a spreadsheet, for example. To test, I would take an MD5 or SHA hash of all your documents, and then a week from now check the files against the hashes you made this week.
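
The hash-now, check-later test suggested above could be sketched in Python roughly like this. It is only an illustration under some assumptions: SHA-256 is used as the "sha" variant, the script name `hashcheck.py` and the function names are made up, and the manifest file should be kept outside the folder being hashed (ideally on a different disk). A file you edited on purpose will also show up as changed, so this works best on documents that are not supposed to change.

```python
import hashlib
import json
import os
import sys


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash one file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Walk root and return {relative path: sha256 hex digest}."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest


def compare(old, new):
    """Return the paths present in both manifests whose hash differs."""
    return sorted(p for p in old if p in new and old[p] != new[p])


if __name__ == "__main__":
    # Hypothetical usage: python hashcheck.py save|check ROOT MANIFEST.json
    mode, root, manifest_path = sys.argv[1:4]
    if mode == "save":
        with open(manifest_path, "w") as out:
            json.dump(build_manifest(root), out, indent=2)
    else:
        with open(manifest_path) as src:
            saved = json.load(src)
        changed = compare(saved, build_manifest(root))
        for path in changed:
            print("CHANGED:", path)
        print(f"{len(changed)} file(s) differ from the saved hashes")
```

Run it with `save` this week and `check` next week against the same folder; any file reported as changed that you know you did not touch is worth a closer look.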
 
Well, both my controller and my drives have the newest firmware that is out. Adaptec's list of supported drives hasn't been updated in about 1½ years, so I have no idea if the 204UI is officially supported. And since I am out of the free support, I would have to pay about $100 just to find out my drives are not supported. That kinda sucks.
I could try doing some MD5 hashes and see if anything goes wrong.

I think I may have to sell my 3805 and get a 5805; the 3-series pretty much sucks.
 
I would double check the drive compatibility for your drives and controller... this pdf for the 3805 and this list for the 5805 have a list of supported drives [and enclosures]. If a drive is not on the list, I would be wary about using it with that adapter. If the drives aren't specifically supported by the manufacturer, you are much more likely to have weird problems [like above].

I've had some similar weirdness with 3ware cards and 7200.11 drives that became unsupported after a [support recommended] firmware upgrade; I had multiple drives falling out of a RAID6 array.
 
OK, I think I have a controller fault.

CN1 Dev0 keeps failing.

I have tried a different hard drive (same model)
and a different power plug (I unplugged my other drives that are not on the RAID controller).

But the same CN1 Dev0 is still failing.

I think I will call Adaptec.
 
Are your drives on Adaptec's supported list?

Also, did you try a new SFF cable (whatever model you use)?

And lastly, disable each drive's cache in the controller's Ctrl+A menu; that might eliminate the issues.
 
I read that as "Contain1 Device 0" is failing, which means one of your Samsung drives might be going. Maybe do a SMART check on it before blaming the card.

Hardware RAID cards *do* go bad, but I would say 9 times out of 10 it's a hard drive going or, possibly in your case, being incompatible.
 
Yeah, thanks for the heads-up about drive compatibility. I am also a bit worried about it. My drives are not on the list, and I think I am going to switch to a 6805 instead: a really nice card with dedicated Ubuntu drivers, an insane CPU, and full support from Adaptec for my Samsung 204UI drives.


Sadly, the drives are not on Adaptec's supported list; it is outdated by about 1½ years now.
But yeah, I am going to try disabling the drive cache this weekend and see if that fixes the last "problems". I disabled NCQ and the aborted commands dropped like a rock, but they still show up.

The drives are brand new and have only run about 150 hours or so (if that much). It is 4 new drives and 2 old drives, so I don't think the drives are at fault.
 
I have a 3805 controller with six HGST 7200 RPM 4TB drives in a RAID6 array. No hardware/SMART errors whatsoever. Server 2003 x86.

I am getting many thousands of "aborted commands", but this seems to have no effect on performance.

Very happy with the entire arrangement. For a couple of days, the "aborted commands" created some anxiety, but it's been several months now with no ill effect. It's an extremely positive thing that this old controller (has "latest" firmware, which is to say it's still not that recent) is running a 16TB array so nicely.

I asked Adaptec for some clarification, but was essentially told that this controller was out of warranty.

For what it's worth, the HGST drives are terrific. From many accounts, they are in general the most reliable drives out there, more reliable than drives from WD (the parent company of HGST).
 
As I mentioned in my reply to this thread just over three years ago, be very careful if you have mission-critical data on these arrays with the aborted command errors. Way back when, before we dumped Adaptec for Areca, we had issues with bit rot on the arrays. Run ZFS or some other checksummed filesystem to be safe in that case.
 
We are using the array to store Symantec System Recovery backups. These are checksummed binary image files. I store them in blobs no larger than DVD-size (4.7GB).

For whatever reason, all file checks seem to be in order. I do them regularly. Not to mention that my occasional file restores are just fine.

Clearly YMMV so your advice should be taken seriously, mw.
 