Does smart data reset after a while?

Red Squirrel · Nov 29, 2015

I was curious and looking at the power on hours on some of my drives, and they are very low compared to what I'd expect. I have not replaced any drives in years and some are showing less than a year's worth of hours. I don't believe any of these drives go to sleep. I always make sure not to buy drives like that for my systems, as they can be problematic when the RAID software thinks a sleeping drive has failed.

Looking at one of my arrays, the oldest one, and most of the drives are in the 5000 hour range. That's less than a year. I know I've had them in there for several years now, not to mention the server uptime is 544 days, so even if it resets after a reboot it should at least match the uptime.
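To put the two figures above on the same scale, here is a minimal sketch of the conversion. The function names are made up for illustration, and it assumes the raw Power_On_Hours value is a plain hour count, which is how these drives appear to report it.

```python
HOURS_PER_DAY = 24


def hours_to_days(hours):
    """Convert a Power_On_Hours raw value to days of continuous running."""
    return hours / HOURS_PER_DAY


def days_to_hours(days):
    """Convert an uptime in days to the hours a drive would have logged."""
    return days * HOURS_PER_DAY


reported_hours = 5000  # rough figure quoted above for the oldest array
uptime_days = 544      # server uptime quoted above

print(f"{reported_hours} h is about {hours_to_days(reported_hours):.0f} days")
print(f"{uptime_days} days of uptime is {days_to_hours(uptime_days)} h")
```

A drive that had been powered for the whole uptime would be expected to show at least the second number, which is why the readings look odd.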

Am I maybe reading it wrong? Here's an example output for a WD black:

Code:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       1
  3 Spin_Up_Time            0x0027   173   173   021    Pre-fail  Always       -       4308
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       38
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   094   093   000    Old_age   Always       -       5088
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       36
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       24
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       15
194 Temperature_Celsius     0x0022   113   101   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       2

The start stop count seems a little high too, do WD blacks go to sleep? I've always had issues with this array, wondering if that's why?
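For anyone reading the table above by eye, the raw counters live in the last column. A rough sketch of pulling them out of pasted `smartctl` text follows. `parse_smart_attributes` is a hypothetical helper, not part of smartmontools, and it assumes the ten-column layout shown above.

```python
def parse_smart_attributes(text):
    """Map attribute name -> raw value (as a string) from smartctl's attribute table.

    Assumes the ten-column layout shown in this thread; the header line and
    anything else that does not start with a numeric ID is skipped.
    """
    attrs = {}
    for line in text.splitlines():
        # Split into at most 10 fields so a raw value containing spaces stays whole.
        parts = line.split(None, 9)
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        attrs[parts[1]] = parts[9].strip()
    return attrs


SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       38
  9 Power_On_Hours          0x0032   094   093   000    Old_age   Always       -       5088
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       36
"""

attrs = parse_smart_attributes(SAMPLE)
print(attrs["Power_On_Hours"], attrs["Start_Stop_Count"])
```

The raw values are kept as strings because some drives report them in other formats, so converting to `int` is left to the caller.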

SomeGuy133 · Nov 29, 2015

38 starts and stops is high? Let me know when it's in the 1000s, or at least the 100s. Not sure which drive it was that had a couple K starts and stops. The most I'd imagine I've got on a PC is 500+ starts and stops.

drescherjm · Nov 29, 2015

If the drive went to sleep it would show up in the start / stop count. You think a start / stop count of 38 is high?

Does smart data reset after a while?

From looking at the SMART for several hundred drives and monitoring it with nagios and other software over the last decade or so I have not seen a single instance of a reset of the power on hours.

SomeGuy133 · Nov 29, 2015

drescherjm said:
If the drive went to sleep it would show up in the start / stop count. You think a start / stop count of 38 is high?

yea 38 start stops in 5000 hours is solid. If he was in the 100s (like 500-1000) with 5000 hours there may be some concern about wear.

The only thing I recall resetting is error rates, as in the ones that are created by bad cables. Once the issue is fixed the number goes back to 0, IIRC. At least I thought that happened on one of my drives.

longblock454 · Nov 29, 2015

If it weren't for kernel updates, that count would be much lower:

Code:

  9 Power_On_Hours          0x0032   100   100   001    Old_age   Always       -       33781
 12 Power_Cycle_Count       0x0032   100   100   001    Old_age   Always       -       80

ToddW2 · Nov 29, 2015

longblock454 said:
If it weren't for kernel updates, that count would be much lower:

Code:

  9 Power_On_Hours          0x0032   100   100   001    Old_age   Always       -       33781
 12 Power_Cycle_Count       0x0032   100   100   001    Old_age   Always       -       80

nearly 2x a month over a 3.5 year period isn't exactly that impressive... just saying
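The per-month figure can be checked from the two raw values quoted above. This is only a sketch: it assumes the drive was powered essentially the whole time, so power-on hours stand in for calendar time, and it uses an average month of 730.5 hours.

```python
HOURS_PER_YEAR = 365.25 * 24         # 8766 hours in an average year
HOURS_PER_MONTH = HOURS_PER_YEAR / 12  # 730.5 hours in an average month


def cycles_per_month(power_cycles, power_on_hours):
    """Average power cycles per month of powered-on time."""
    months = power_on_hours / HOURS_PER_MONTH
    return power_cycles / months


power_on_hours = 33781  # Power_On_Hours raw value quoted above
power_cycles = 80       # Power_Cycle_Count raw value quoted above

years = power_on_hours / HOURS_PER_YEAR
rate = cycles_per_month(power_cycles, power_on_hours)
print(f"{years:.2f} years powered on, {rate:.2f} power cycles per month")
```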

Red Squirrel · Dec 1, 2015

Interesting that my power on hours are so low then, as some of these drives have been running for 3+ years easily yet show less than 1. I'm more or less just curious about this more than anything, so I looked into it deeper and it's only my WD blacks that seem to have low figures. So maybe these drives go to sleep? Could explain the issues I've had with this array. Could very well be that the old server put them to sleep too for some reason... I used to get a LOT of issues with them when they were in that server. I'd figure it was maybe a year or two ago that I moved them, and I could very well be off and it might be less... so maybe these figures sorta make sense.

This array was grown over the years too so it's normal that some are way different. But the oldest drive should be like 5 years old (as in, running time, my system is on 24/7) at least I would have figured... but maybe I'm thinking they're older than they really are. I did replace all of them at one point... so now that I think about it, maybe these numbers actually are realistic, especially if the old server controller was doing something funny with them like putting them to sleep. I'm not about to pull one out to check the date on it.

But think I may have solved my mystery, I did forget that I replaced every single one of them at one point, though seems that was still over a year ago.

This is it:

Code:

       0       8       32        0      active sync   /dev/sdc
       1       8      192        1      active sync   /dev/sdm
       2       8       48        2      active sync   /dev/sdd
       3       8       16        3      active sync   /dev/sdb
       4       8       96        4      active sync   /dev/sdg
       5       8        0        5      active sync   /dev/sda
       6       8       80        6      active sync   /dev/sdf
       7       8       64        7      active sync   /dev/sde


[root@isengard ~]# smartctl -a /dev/sdc | grep -i power_on
  9 Power_On_Hours          0x0032   093   092   000    Old_age   Always       -       5669
[root@isengard ~]# smartctl -a /dev/sdm | grep -i power_on
  9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       5118
[root@isengard ~]# smartctl -a /dev/sdd | grep -i power_on
  9 Power_On_Hours          0x0032   086   085   000    Old_age   Always       -       10922
[root@isengard ~]# smartctl -a /dev/sdb | grep -i power_on
  9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       5122
[root@isengard ~]# smartctl -a /dev/sdg | grep -i power_on
  9 Power_On_Hours          0x0032   087   086   000    Old_age   Always       -       9846
[root@isengard ~]# smartctl -a /dev/sda | grep -i power_on
  9 Power_On_Hours          0x0032   095   094   000    Old_age   Always       -       4233
[root@isengard ~]# smartctl -a /dev/sdf | grep -i power_on
  9 Power_On_Hours          0x0032   087   086   000    Old_age   Always       -       9971
[root@isengard ~]# smartctl -a /dev/sde | grep -i power_on
  9 Power_On_Hours          0x0032   087   086   000    Old_age   Always       -       9971
[root@isengard ~]# 
[root@isengard ~]# 
[root@isengard ~]# 
[root@isengard ~]# smartctl -a /dev/sdc | grep -i start_stop
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       46
[root@isengard ~]# smartctl -a /dev/sdm | grep -i start_stop
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       38
[root@isengard ~]# smartctl -a /dev/sdd | grep -i start_stop
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       14
[root@isengard ~]# smartctl -a /dev/sdb | grep -i start_stop
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       38
[root@isengard ~]# smartctl -a /dev/sdg | grep -i start_stop
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       19
[root@isengard ~]# smartctl -a /dev/sda | grep -i start_stop
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       29
[root@isengard ~]# smartctl -a /dev/sdf | grep -i start_stop
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       13
[root@isengard ~]# smartctl -a /dev/sde | grep -i start_stop
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       13
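One way to sanity-check the hours above is to turn each reading into the latest date the drive could have gone into service. The sketch below assumes the readings were taken on the date of this post (Dec 1, 2015) and that the counter only advances while the drive is powered. `latest_install_date` is a name invented for illustration.

```python
from datetime import date, timedelta


def latest_install_date(read_on, power_on_hours):
    """Latest date the drive could have first been powered on.

    Any time spent powered off pushes the real date earlier, so this is an
    upper bound, rounded to whole days.
    """
    return read_on - timedelta(days=power_on_hours // 24)


READ_ON = date(2015, 12, 1)  # assumption: readings taken on the post date

# Power_On_Hours raw values from the output above.
hours_by_drive = {
    "/dev/sdc": 5669, "/dev/sdm": 5118, "/dev/sdd": 10922, "/dev/sdb": 5122,
    "/dev/sdg": 9846, "/dev/sda": 4233, "/dev/sdf": 9971, "/dev/sde": 9971,
}

# Oldest drives first.
for drive, hours in sorted(hours_by_drive.items(), key=lambda kv: -kv[1]):
    print(drive, hours, latest_install_date(READ_ON, hours))
```

Sorted this way, the readings fall into two groups, around 10,000 hours and around 5,000 hours, which would fit drives being swapped in at two different times.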

My newest raid array makes more sense:

Code:

    Number   Major   Minor   RaidDevice State
       0       8      112        0      active sync set-A   /dev/sdh
       1       8      144        1      active sync set-B   /dev/sdj
       2       8      160        2      active sync set-A   /dev/sdk
       3       8      128        3      active sync set-B   /dev/sdi
[root@isengard ~]# 
[root@isengard ~]# 
[root@isengard ~]# 
[root@isengard ~]# smartctl -a /dev/sdh | grep -i power_on_hours
  9 Power_On_Hours          0x0012   098   098   000    Old_age   Always       -       16650
[root@isengard ~]# smartctl -a /dev/sdj | grep -i power_on_hours
  9 Power_On_Hours          0x0012   098   098   000    Old_age   Always       -       16746
[root@isengard ~]# smartctl -a /dev/sdk | grep -i power_on_hours
  9 Power_On_Hours          0x0012   098   098   000    Old_age   Always       -       16746
[root@isengard ~]# smartctl -a /dev/sdi | grep -i power_on_hours
  9 Power_On_Hours          0x0012   098   098   000    Old_age   Always       -       16746
[root@isengard ~]# 
[root@isengard ~]# 
[root@isengard ~]# smartctl -a /dev/sdh | grep -i start_stop
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       5
[root@isengard ~]# smartctl -a /dev/sdj | grep -i start_stop
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       25
[root@isengard ~]# smartctl -a /dev/sdk | grep -i start_stop
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       22
[root@isengard ~]# smartctl -a /dev/sdi | grep -i start_stop
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       28

The drive with 5 start-stop was DOA, while the other 3 went in when I built the box, and were used to individually test each bay, so they got a lot of start/stop, and the server itself got rebooted a lot while being set up.

The hours on those drives seem about right.

My other array seems right too, it's newer than the one just above. It's just the WD blacks that seem very low even though that's my very first array and is older than the actual server, which has over a year of up time. (had to turn it down for power maintenance a few months after it was deployed)

Code:

    Number   Major   Minor   RaidDevice State
       0      65       48        0      active sync set-A   /dev/sdt
       1      65       64        1      active sync set-B   /dev/sdu
       2      65       80        2      active sync set-A   /dev/sdv
       3      65       96        3      active sync set-B   /dev/sdw
       4      65      128        4      active sync set-A   /dev/sdy
       5      65      144        5      active sync set-B   /dev/sdz
       6      65      160        6      active sync set-A   /dev/sdaa
       7      65      176        7      active sync set-B   /dev/sdab


[root@isengard ~]# smartctl -a /dev/sdt | grep -i power_on_hours
  9 Power_On_Hours          0x0032   083   083   000    Old_age   Always       -       12742
[root@isengard ~]# smartctl -a /dev/sdu | grep -i power_on_hours
  9 Power_On_Hours          0x0032   081   081   000    Old_age   Always       -       14590
[root@isengard ~]# smartctl -a /dev/sdv | grep -i power_on_hours
  9 Power_On_Hours          0x0032   083   083   000    Old_age   Always       -       12836
[root@isengard ~]# smartctl -a /dev/sdw | grep -i power_on_hours
  9 Power_On_Hours          0x0032   081   081   000    Old_age   Always       -       14551
[root@isengard ~]# smartctl -a /dev/sdy | grep -i power_on_hours
  9 Power_On_Hours          0x0032   082   082   000    Old_age   Always       -       13747
[root@isengard ~]# smartctl -a /dev/sdz | grep -i power_on_hours
  9 Power_On_Hours          0x0032   080   080   000    Old_age   Always       -       14899
[root@isengard ~]# smartctl -a /dev/sdaa | grep -i power_on_hours
  9 Power_On_Hours          0x0032   082   082   000    Old_age   Always       -       13725
[root@isengard ~]# smartctl -a /dev/sdab | grep -i power_on_hours
  9 Power_On_Hours          0x0032   084   084   000    Old_age   Always       -       12141
[root@isengard ~]# 
[root@isengard ~]# 
[root@isengard ~]# smartctl -a /dev/sdt | grep -i start_stop
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       24
[root@isengard ~]# smartctl -a /dev/sdu | grep -i start_stop
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       11
[root@isengard ~]# smartctl -a /dev/sdv | grep -i start_stop
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       12
[root@isengard ~]# smartctl -a /dev/sdw | grep -i start_stop
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       11
[root@isengard ~]# smartctl -a /dev/sdy | grep -i start_stop
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       16
[root@isengard ~]# smartctl -a /dev/sdz | grep -i start_stop
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       37
[root@isengard ~]# smartctl -a /dev/sdaa | grep -i start_stop
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       15
[root@isengard ~]# smartctl -a /dev/sdab | grep -i start_stop
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       3

This made for some fun statistics to pull, if anything.

I should probably write a script or app to automate this by guid and monitor these stats, actually.
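A very rough starting point for that kind of script, under some assumptions: `smartctl -A` (attributes only) is on the path and run as root, the output uses the ten-column table seen above, and the function names here are hypothetical. The `run` parameter exists only so the parsing can be tried without real hardware.

```python
import re
import subprocess

WANTED = ("Power_On_Hours", "Start_Stop_Count")


def run_smartctl(device):
    """Return the text of `smartctl -A <device>` (attribute table only)."""
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


def read_counters(device, run=run_smartctl):
    """Return {attribute name: integer raw value} for the attributes in WANTED."""
    counters = {}
    for line in run(device).splitlines():
        parts = line.split()
        if len(parts) >= 10 and parts[1] in WANTED:
            # Take the leading digits only, in case the raw field has a suffix.
            match = re.match(r"\d+", parts[9])
            if match:
                counters[parts[1]] = int(match.group())
    return counters
```

Calling `read_counters("/dev/sdc")` for each member of an array and logging the result with a timestamp would show over time whether a counter ever goes backwards. Device names like `/dev/sdc` can change between boots, so a stable identifier such as the drive serial number would be a better key.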
