HP Enterprise SSD Firmware Bug Causes them to Fail at 32,768 Hours of Use

erek

[H]F Junkie
Joined
Dec 19, 2005
Messages
10,897
HP has released an SSD firmware update that fixes this bug and cannot stress the importance of deploying it enough. This is because once a drive hits the 32,768-hour mark and breaks down, both the drive and the data on it become unrecoverable. There is no mitigation for this bug other than the firmware update. HP has released easy-to-use online firmware update tools that let admins update the firmware of their drives from within their OS. The online firmware update tools support Linux, Windows, and VMware. Below is a list of affected drives. Get the appropriate firmware update from this page.

https://www.techpowerup.com/261560/...m-to-fail-at-32-768-hours-of-use-fix-released
 
There's a Reddit thread about this, and for those who didn't click through erek's link, it's even worse than the summary. Stood up a RAID array of these SSDs a few years ago? How'd you like to have 6 drives all fail in a 15-minute period? At least two people reported that.
 
And they weren't kidding about no mitigation and the word "unrecoverable". Plug a dead drive into a new machine, and the controller won't recognize it (one person said it "didn't even read the serial number"). Better hope you had a backup on external media, because if your drive hits the limit, it's definitively bricked.
 
Fascinating. I'm guessing hours is somehow stored in a 16-bit signed integer, and once it reaches the maximum value of 32,767 (2^15 − 1), the next increment wraps it to negative values. The negative value is probably the trigger for the bug. Sloppy software design there on HP's part.
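The guessed wraparound is easy to demonstrate. Here's a sketch in Python using ctypes to emulate the presumed 16-bit signed counter (the actual firmware internals are unknown, this is just the overflow behavior):

```python
import ctypes

# Emulate a power-on-hours counter stored as a 16-bit signed integer.
hours = ctypes.c_int16(32767)  # maximum positive value: 2^15 - 1
hours.value += 1               # one more hour of uptime...
print(hours.value)             # wraps around to -32768
```

If the firmware treats a negative power-on-hours reading as a fatal error, that one increment is all it takes.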

32768 hours is approximately 3.74 years. That should give an IT person some window to gauge how worried they should be right now.
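The arithmetic behind that figure, for anyone doing the same math on their deployment date:

```python
hours = 32768
days = hours / 24        # ~1365.3 days of continuous power-on time
years = days / 365.25    # ~3.74 years
print(round(years, 2))   # 3.74
```

Note that this is power-on hours, not calendar time, so drives that are powered off part of the day hit the limit later.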
 
Sigh. HP gonna HP. Their computers are just as reliable as their printers, which is to say not very :p
 
The last time I had an HP laptop, it was great...until it suffered a CPU fault after 8 months and wouldn't start (not "wouldn't boot", mind. It wouldn't even POST, just flashed the LEDs to indicate a CPU failure.) They ate up the rest of my warranty repeatedly "replacing the motherboard" (though I suspect they didn't actually do that) without ever fixing the problem.
 
I think it is amazing that there is no way to force a firmware update to the drive. I guess the controller needs to be "alive" to do so, and it bricks itself the first second it reads the negative uptime value from flash memory. Is there no way to create a tiny bootloader firmware that zeros out the negative SMART uptime value and then flashes the fixed firmware?

Ugh. Just so bad on many levels.
 
I think it is amazing that there is no way to force a firmware update to the drive.

If the drive won't even talk to the controller, how are you going to flash it? People on Reddit said they put the dead drives in different computers and the drives weren't recognized. Nothing you can do about that.

Oh, if you sent it back to HP they could probably force an update through JTAG pins or something, but that's not consumer- or even enterprise-level stuff. Also, based on my experience with HP, they'd probably just dip it in pizza sauce and send it back to you.
 
If the drive won't even talk to the controller
I assumed the firmware was stored on a separate and much smaller flash chip, similar to a 32Mbit BIOS flash chip on a computer. Apparently that isn't the case. I guess they just reserve a small portion of the main flash for firmware storage. In that case, yeah, it's toast if the SSD's CPU locks up the moment it reads the negative uptime value.
 