SSD Life Left SMART Warning With Only 4TB Written

jimnms

Are there other factors besides writes that affect the SSD Life Left SMART flag? I had a sudden surprise last night when I RDP'd into my Windows 7 Pro machine, which I use for recording TV with Media Center, Steam game streaming to my TV and a few other things. There was an error saying Windows detected a hard disk problem on my C: drive, which is an SSD. The first thing I did was update and run the SSD's diagnostic utility to see what was going on. It does show a warning on the drive health, but it shows 100% estimated life remaining. I ran all of the diagnostics and the drive passes. The utility only shows the raw SMART data and doesn't flag the problem. It also reports the total bytes written over the drive's life as only 4.12TB. The drive just has Windows and a few programs on it; everything else is written to HDDs.

I also have GSmartControl installed, so I updated it and it shows the SSD Life Left flag as the cause for the SMART warning (current value is 6, threshold is 10), which I thought was odd because the drive's diagnostic says that is fine.

I downloaded and installed SSDLife Pro for a 3rd opinion, and it shows the drive health at 6% and estimates it will fail in less than a month! If the warning should have kicked in when the value dropped to or below the threshold of 10, then either Windows didn't get the message or the value dropped suddenly for some reason.

With so few writes to the SSD, I didn't think it would be failing even though it is old. The system does run 24/7 and only reboots once a month for updates. The SSD utility shows 5 years 8 months of power on time with 191 power cycles and 4.12TB of writes. For comparison, the SSD in my gaming PC has a little over 1.5 years of power on time with almost 1300 power cycles and 20TB written. SSDLife Pro shows its health at 100% and estimates 9 years of life left.

I'm already planning to cut my cable after the summer, so I won't need Media Center for recording TV, but I need to keep the machine for game streaming and other uses. With Windows 7's EOL coming in January, I was going to go with Linux unless MS offered another opportunity to upgrade to Windows 10 for free, but for now I still need Media Center for recording TV and don't have the time to deal with changing OSes.

Right now I see 3 options:
  1. Spend money and save a bit of time by buying a new drive and cloning the failing one.
  2. Spend time and install Windows to one of the HDDs already in it and set everything back up.
  3. Ride it out and hope it doesn't fail until after the summer, when I had planned to work on it anyway. Nothing important is installed on it, so I won't lose anything if it fails, and I would have to go with option 1 or 2 anyway.
 
Just kind of throwing a few ideas out here but unfortunately no solutions. If the diagnostics are accurate, then there are a few possibilities that all point in the same direction.

1. TRIM wasn't functioning like it should and your writes were repeatedly going to the same cells instead of being leveled across others.
2. Defective cells? Haven't heard of such a thing but I guess anything is possible.
3. The SSD controller has been malfunctioning.
4. Maybe an unusual power event damaged it at one point.
5. Maybe it overheated at some point and was damaged or has been operating at high temps for so long that it's shortened the lifespan.
 
You make no mention of what SSD this is and your signature doesn't state any storage.
How full is / has the drive been? A nearly full drive could explain the failure if a small number of cells have had high writes.
If it's always had plenty of space free I would contact the mfr to ask why this can happen.
 
I also wanted to point out that if you download the Windows 10 Media Creation Tool, you can activate Windows 10 with your existing Windows 7 license key. GL!

 
Just kind of throwing a few ideas out here but unfortunately no solutions. If the diagnostics are accurate, then there are a few possibilities that all point in the same direction.

1. TRIM wasn't functioning like it should and your writes were repeatedly going to the same cells instead of being leveled across others.
2. Defective cells? Haven't heard of such a thing but I guess anything is possible.
3. The SSD controller has been malfunctioning.
4. Maybe an unusual power event damaged it at one point.
5. Maybe it overheated at some point and was damaged or has been operating at high temps for so long that it's shortened the lifespan.

1: It is Windows 7 Pro SP1, which is supposed to support TRIM. SSDLife Pro showed it as supported and enabled, but I just checked to make sure with the "fsutil behavior query DisableDeleteNotify" command, and it is enabled according to Windows 7.
2 & 3: I have no idea, but there are no disk errors in the event viewer prior to last night when this error happened.
4: The system is on a UPS and is set to shut down after 15 minutes on battery because usually if the power isn't back on by then, it's going to be a while.
5: The max temp shown by the SMART is 47°C (It's currently sitting at 30°C)
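
For anyone who wants to script the TRIM check from point 1, here is a minimal sketch in Python. It assumes the command prints a line containing "DisableDeleteNotify = 0" when TRIM is enabled and "= 1" when it is disabled; the helper names `trim_enabled` and `query_trim` are made up for this example, and `query_trim` only works on Windows (it may need an elevated prompt).

```python
import subprocess


def trim_enabled(fsutil_output):
    """Interpret fsutil text: DisableDeleteNotify = 0 means TRIM is on, 1 means off."""
    for line in fsutil_output.splitlines():
        if "DisableDeleteNotify" in line and "=" in line:
            value = line.split("=", 1)[1].strip()
            # Assumption: some Windows versions append extra text after the digit,
            # so only the first character of the value is checked.
            return value.startswith("0")
    raise ValueError("DisableDeleteNotify not found in output")


def query_trim():
    """Run the same command locally (Windows only) and interpret the result."""
    out = subprocess.run(
        ["fsutil", "behavior", "query", "DisableDeleteNotify"],
        capture_output=True, text=True, check=True,
    ).stdout
    return trim_enabled(out)


# Parsing a sample line, so this part runs on any OS.
print(trim_enabled("DisableDeleteNotify = 0"))
```

Note that this only tells you Windows is sending TRIM commands. It says nothing about how the drive's controller handled them over the years.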

You make no mention of what SSD this is and your signature doesn't state any storage.
How full is / has the drive been? A nearly full drive could explain the failure if a small number of cells have had high writes.
If it's always had plenty of space free I would contact the mfr to ask why this can happen.

It's a 64GB ADATA SP900. The drive has 10% unallocated for over-provisioning, a 100MB EFI partition and the rest is partitioned for C:. The C: drive is only a little over half full (29.8GB) and besides temporary usage, it's had the same programs on it all this time. I somehow don't think the manufacturer will care about a 6 year old drive, but I guess it can't hurt to ask.

I also wanted to point out that if you download the Windows 10 Media Creation Tool, you can activate Windows 10 with your existing Windows 7 license key. GL!

I thought they ended that?
 
In all reality the app is probably interpreting the SMART values wrong and the SSD is fine. Post a screenshot from the latest version of CrystalDiskInfo for that disk.
 
Products can be defective. Writes aren't the sole cause of an SSD breaking down.

What did the SMART info say specifically? You left out probably the most important part: the actual diagnostics info.
 
a 6 year old drive,

Things could be fine, but 6 years is starting to reach a limit. My oldest are Intels at 7 (320 series) and 8 (520 series) years respectively, but if I started getting errors I'd just call it good and move on. Like extide mentioned, it's possible that the S.M.A.R.T. data is being interpreted wrong, but at six years with any kind of significant error I'd be concerned. Out of 30-50 SSDs between work and home I've only seen 3 fail, and unfortunately without errors reported beforehand. They were vastly different in brand and use but the end result was the same: data was unrecoverable. I know you mentioned you weren't concerned with that, but I'd recommend just moving on at this point.
 
CrystalDiskInfo shows the same thing as GSmartControl and SSDLife. The only SMART value at or below its threshold is SSD_Life_Left. Here is the output from smartctl:

Code:
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0033   095   095   050    Pre-fail  Always       -       0/102119005
  5 Retired_Block_Count     0x0033   100   100   003    Pre-fail  Always       -       0
  9 Power_On_Hours_and_Msec 0x0032   043   043   000    Old_age   Always       -       50058h+26m+35.330s
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       191
171 Program_Fail_Count      0x000a   000   000   000    Old_age   Always       -       0
172 Erase_Fail_Count        0x0032   000   000   000    Old_age   Always       -       0
174 Unexpect_Power_Loss_Ct  0x0030   000   000   000    Old_age   Offline      -       76
177 Wear_Range_Delta        0x0000   000   000   000    Old_age   Offline      -       5
181 Program_Fail_Count      0x000a   000   000   000    Old_age   Always       -       0
182 Erase_Fail_Count        0x0032   000   000   000    Old_age   Always       -       0
187 Reported_Uncorrect      0x0012   100   100   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   030   047   000    Old_age   Always       -       30 (Min/Max -21/47)
195 ECC_Uncorr_Error_Count  0x001c   120   120   000    Old_age   Offline      -       0/102119005
196 Reallocated_Event_Count 0x0033   100   100   003    Pre-fail  Always       -       0
201 Unc_Soft_Read_Err_Rate  0x001c   120   120   000    Old_age   Offline      -       0/102119005
204 Soft_ECC_Correct_Rate   0x001c   120   120   000    Old_age   Offline      -       0/102119005
230 Life_Curve_Status       0x0013   100   100   000    Pre-fail  Always       -       100
231 SSD_Life_Left           0x0013   006   006   010    Pre-fail  Always   FAILING_NOW 1
233 SandForce_Internal      0x0032   000   000   000    Old_age   Always       -       19457
234 SandForce_Internal      0x0032   000   000   000    Old_age   Always       -       4221
241 Lifetime_Writes_GiB     0x0032   000   000   000    Old_age   Always       -       4221
242 Lifetime_Reads_GiB      0x0032   000   000   000    Old_age   Always       -       3972
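
The tools that flag this drive appear to be comparing each attribute's normalized VALUE against its THRESH column. A rough sketch of that check in Python, run against three rows copied from the output above. The regex and the function name `failing_attributes` are my own, and skipping rows with a threshold of 000 is an assumption so that the informational 000/000 rows don't trip it.

```python
import re

# Sample rows copied from the smartctl output above.
SAMPLE = """\
  5 Retired_Block_Count     0x0033   100   100   003    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0022   030   047   000    Old_age   Always       -       30 (Min/Max -21/47)
231 SSD_Life_Left           0x0013   006   006   010    Pre-fail  Always   FAILING_NOW 1
"""

# One attribute row: ID, name, flag, value, worst, thresh, type, updated, when_failed, raw.
ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+(?P<flag>0x[0-9a-fA-F]+)\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"(?P<type>\S+)\s+(?P<updated>\S+)\s+(?P<when_failed>\S+)\s+(?P<raw>.*)$"
)


def failing_attributes(text):
    """Return (id, name, value, thresh) for rows where thresh > 0 and value <= thresh."""
    hits = []
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # header lines and anything that is not an attribute row
        value, thresh = int(m["value"]), int(m["thresh"])
        if thresh > 0 and value <= thresh:
            hits.append((int(m["id"]), m["name"], value, thresh))
    return hits


print(failing_attributes(SAMPLE))
```

On these rows only attribute 231 is reported (value 6 against threshold 10), which matches the FAILING_NOW marker in the smartctl output.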

ADATA SSD Toolbox:
ADATA_SSDTB.png

SSDLife Pro:
SSDLife.png

Things could be fine, but 6 years is starting to reach a limit. My oldest are Intels at 7 (320 series) and 8 (520 series) years respectively, but if I started getting errors I'd just call it good and move on. Like extide mentioned, it's possible that the S.M.A.R.T. data is being interpreted wrong, but at six years with any kind of significant error I'd be concerned. Out of 30-50 SSDs between work and home I've only seen 3 fail, and unfortunately without errors reported beforehand. They were vastly different in brand and use but the end result was the same: data was unrecoverable. I know you mentioned you weren't concerned with that, but I'd recommend just moving on at this point.

Yes, 6 years is a long time for a storage device. I'm not really complaining about that, although I didn't think time would be as big a factor for an SSD as it is for a mechanical drive with moving parts. It's the timing of the situation. I only needed this system to last a few more months before I had planned to make some changes to it, but now I've either got to do something or just hope it lasts. I do find it a bit suspicious that the drive starts to "fail" right after passing 50,000 hours.
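
To put the numbers from the smartctl output in perspective, here is a quick back-of-the-envelope calculation in Python. It uses the raw values of attributes 9 and 241 as posted. Treating the drive as 64 decimal GB and counting host writes only (no write amplification) are assumptions, and the function name `usage_summary` is made up for this sketch.

```python
POWER_ON_HOURS = 50058      # attribute 9 raw value, whole hours only
LIFETIME_WRITES_GIB = 4221  # attribute 241 raw value (host writes)
CAPACITY_GB = 64            # nominal capacity; decimal GB is an assumption

HOURS_PER_YEAR = 24 * 365.25


def usage_summary(hours, writes_gib, capacity_gb):
    """Return (years powered on, GiB written per day, full-drive writes)."""
    years_on = hours / HOURS_PER_YEAR
    gib_per_day = writes_gib / (hours / 24)
    writes_gb = writes_gib * 2**30 / 1e9  # GiB -> decimal GB
    full_drive_writes = writes_gb / capacity_gb
    return years_on, gib_per_day, full_drive_writes


years_on, gib_per_day, full_drive_writes = usage_summary(
    POWER_ON_HOURS, LIFETIME_WRITES_GIB, CAPACITY_GB
)
print(f"{years_on:.2f} years powered on")
print(f"{gib_per_day:.2f} GiB written per day")
print(f"{full_drive_writes:.0f} full-drive writes")
```

That works out to roughly 5.7 years of power-on time, about 2 GiB of host writes per day, and on the order of 70 full-drive writes, which is why the wear reading looks so out of line with the write count.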

I am leaning toward just going ahead and replacing it. It will be less time consuming to get a new drive and clone the old one rather than have to install Windows and set everything back up later if it does fail. I looked at some replacements last night. A 64GB SSD is pretty cheap, but it's only $10 more for 120GB, or $20 more for 250GB. That seems to be the sweet spot in terms of price per capacity.
 
Totally understand. On a consumer level a lot of us are part of the 1st or 2nd generation of 'affordable' SSDs that started to hit the market 5-10 years ago. This means we're also the 1st or 2nd test market to see actual real-world use and age results, as opposed to the labs or sites that just brute-forced writes/reads to test endurance. Ironically, though, with the advances of TLC/MLC, 3D NAND, different controllers, DRAM or DRAM-less designs, etc., there are so many newer approaches in design that it's become a whole new Pandora's box to guess how long they'll last.
 
A 64GB SSD is pretty cheap, but it's only $10 more for 120GB, or $20 more for 250GB.

Yeah at those sizes prices are very affordable these days. I spend time fantasizing about 2TB+ but I just can't justify it right now.
 