S3 sleep friendly RAID controller

ptr727

Hi, I am looking for an S3 sleep friendly high performance RAID controller to use in my workstations with external storage.

The basic problem I have with all the controllers I tested is that they do not spin down the drives when the system enters S3 sleep, or, worse, when the system shuts down.

This means that when my workstation goes to sleep or powers down, the drives in the external enclosures keep on spinning, keep on generating noise and heat, and keep on using electricity.

The only controller that worked as expected was the cheapest and slowest, the motherboard-integrated Intel ICH10R chipset, which does correctly spin down the drives.

On the higher end, the cards I tested were the Adaptec 5445, 3Ware 9690SA-4I, and Areca ARC-1680LP.
(I also have an LSI MegaRAID SAS 8888ELP on order.)

The Adaptec not only does not spin down the drives, it also sends an alert every time it wakes up saying that the battery failed, which is not true, since the next log entry says the battery is working fine.

Does anybody have any advice on how to configure these cards to make them spin down the drives before sleeping or powering off?

Or can anybody recommend a different card that does play nice with S3?

Thanks
P.
 
The battery DOES fail in S3. No, you don't get to argue - it's stupidity on Adaptec's part. As soon as you go to S3 on any RAID card, you're in battery discharge. S3 screws up some controllers to the point where they're convinced the battery has failed.

S3 is not appropriate for high end RAID cards. None of them support it. You're not supposed to do S1 or S3 in a server, which is where these cards typically go. I specifically prohibit S1/S3 on systems I work with. If you want power savings, use Hibernate. That's the only option. You need to do a full power down and power up, including POST.
 

I think it is as appropriate for me to use a high performance RAID controller in my workstation as it is for my server.

S3 support is a requirement for WHQL certification and driver signing, and a requirement for Vista and W2K x64 drivers.
I can only assume that the certification does not yet extend to usability, only stability.

The batteries in the controllers I tested last anywhere from 1 to 4 days.

The OS notifies the driver of a power state change to S3, so the controller has sufficient time to flush the cache, meaning that even if the battery were to drain, it would not matter.

The motherboard continues to supply power to the PCIe bus, allowing controller cards to keep powering whatever they want, including the cache memory, or even to keep charging the battery.

There is nothing that prevents the RAID manufacturers from properly supporting S3.

So my question still stands: are there controllers that do work correctly, or settings that can be applied to the controllers I tested to make them work?

Thanks
P.
 
I think it is as appropriate for me to use a high performance RAID controller in my workstation as it is for my server.

Now, here's where the fun is. I don't disagree. The problem is in how S1/S3 work versus D0 + Hibernation. These cards are NOT designed for S1/S3, and that's not likely to change any time soon. The general opinion of the manufacturers is that if you're going to have something with 256MB of cache and a BBU, it better be on a UPS and always on.

S3 support is a requirement for WHQL certification and driver signing, and a requirement for Vista and W2K x64 drivers.
I can only assume that the certification does not yet extend to usability, only stability.

You know the saying about assuming. It's never untrue.
All that's needed to pass PCI compliance and WHQL is for the card to not crash the system when entering S1 or S3, or returning to D0 from either. That's it. The assumption that it must work? Complete fabrication and fantasy. There is absolutely no functional requirement beyond not bombing the system. The end.

The batteries in the controllers I tested last anywhere from 1 to 4 days.

The OS notifies the driver of a power state change to S3, so the controller has sufficient time to flush the cache, meaning that even if the battery were to drain, it would not matter.

They're rated for N hours - typically 72. Past the rated hours, the corruption risk exceeds ECC tolerances. Adaptec is just completely braindead in how they handle S1/S3 and the return to D0, in that battery discharge in any state that is not OFF or D0 is reported as a battery failure. (If anything, it probably should be reported as a charge circuit failure.)
And again, you're operating on assumption. S1/S3 notification does NOT have to result in a cache flush, or ANYTHING other than the card going to S1/S3 without crashing the system. THAT IS ALL THAT HAS TO BE DONE TO PASS. Cards typically do NOT flush on S1/S3, either.

There is nothing that prevents the RAID manufacturers from properly supporting S3.

There's a lot. Come back and make that statement when you've been designing RAID ASICs for 10+ years. I've had talks with people who know this better than even I do. The ignorant assumption that they can just keep a charge circuit running unmanaged, and magically handle state changes as if they were nothing, is just that - an ignorant assumption.
Taking any device from S1/S3 to D0 is NOT an easy task. Things do NOT just magically work. Registers must be cleared and reset, FIFOs reset, state redetermined and reestablished. We do not just wave some magic wand in the driver and say "poof, you're back at D0." That's NOT how it works.
(Let me clarify; OH GODS, do I WISH it worked that way. But really, it doesn't. And disk I/O is easily one of the worst areas there. The ICH10R actually lies about cache flushing because it gets lied to by the SATA drives. If the drives don't spin down appropriately, you will lose data.)
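
To give a rough idea of what that translates to in code, here is a generic, Linux-flavored sketch of the legacy suspend/resume hooks a PCI storage driver has to provide. This is purely illustrative - not any vendor's real code - and hba_flush_cache() / hba_reinit_hw() are made-up stand-ins for the controller-specific work (quiescing I/O, resetting FIFOs, reprogramming registers, re-posting queues, redetecting drive state):
Code:
/* Illustrative sketch only: legacy Linux PCI power-management callbacks.
 * hba_flush_cache()/hba_reinit_hw() are placeholders for the
 * controller-specific work the firmware/driver interface requires. */
#include <linux/pci.h>

static void hba_flush_cache(struct pci_dev *pdev) { /* controller-specific */ }
static int  hba_reinit_hw(struct pci_dev *pdev)   { /* controller-specific */ return 0; }

static int hba_suspend(struct pci_dev *pdev, pm_message_t state)
{
	hba_flush_cache(pdev);                  /* get dirty cache onto disk    */
	pci_save_state(pdev);                   /* snapshot config space / BARs */
	pci_disable_device(pdev);
	pci_set_power_state(pdev, pci_choose_state(pdev, state)); /* -> D3hot  */
	return 0;
}

static int hba_resume(struct pci_dev *pdev)
{
	pci_set_power_state(pdev, PCI_D0);      /* power the function back up   */
	pci_restore_state(pdev);                /* put BARs/config space back   */
	if (pci_enable_device(pdev))
		return -EIO;
	pci_set_master(pdev);                   /* re-enable DMA                */
	return hba_reinit_hw(pdev);             /* the hard part lives here     */
}
The generic PCI calls are the easy half; all the pain the RAID vendors would have to sign up for lives behind hba_reinit_hw().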
 
Thanks for the detailed insight.

So what you are really saying is that I won't find any half decent controller that plays nice with S3 sleep :(

I really wouldn't mind S4 hibernate, except I can't hibernate a system with 8GB of RAM :(

P.
 

Hey, any time. Sorry if it sounded grumpy - I've written S1/S3->D0 transition for Ethernet and WiFi, so I'm a little touchy about it. (Most annoying code -ever-. Seriously.)

Unfortunately, that really does seem to be the case. However, there IS a way to do hibernation with >4GB systems, depending on the OS. What're you running right now? I'll start scrounging up the directions.
 
I am currently running Vista Ultimate x64, soon to switch to Windows 7.
I've heard Win7 supports hibernation for systems with more than 4GB?

I still wonder though: if the controller does not spin down the drives when powering down, I doubt it will spin them down when going to S4.
It seems that spinning down drives would be a much simpler problem than managing the cache?

P.
 
I am currently running Vista Ultimate x64, soon to switch to Windows 7.
I've heard Win7 supports hibernation for systems with more than 4GB?

Yes.
In Vista Ultimate x64, first make sure you're at SP2 or later. Then go to an elevated permission command prompt. Do this:
Code:
powercfg -h on
That should restore hibernation functionality.
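
If you want to double-check what the machine actually offers, powercfg can also list the supported sleep states (same elevated prompt):
Code:
powercfg -a
Hibernate should show up in the available list once it has been turned back on.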

I still wonder though: if the controller does not spin down the drives when powering down, I doubt it will spin them down when going to S4.
It seems that spinning down drives would be a much simpler problem than managing the cache?

P.

Hibernation basically writes everything out to disk and goes through a shutdown. The system is, for all intents and purposes, off. That's where the difference is; power is removed from the relevant devices. The system goes through a shutdown, so any well-behaved controller should flush the cache. That means that when you come back online, the RAID controller goes through BIOS init, which includes spinning up the drives again. RAID itself isn't designed to spin down drives, because all drives need to operate in sync. (In times past, synchronizing spindles precisely was also a critical thing.) MAID is designed for spin-down, but offers poor performance.
 

All the controllers support spinning down drives at idle, so I guess it is a form of MAID?

Coming back to power: drives on the same power supply as the PC will be powered down when the main PSU turns off, but drives on the enclosure PSU must be logically spun down, or else they keep on running.

Do you think it would be reasonable to expect the controller to spin down the drives when powering down, and to pursue this with the vendors?

P.
 

That's an OS function. No RAID controller performs idle spindown of drives. The OS can issue the power down command; it's up to the RAID controller if it permits it, and from there, how it handles it.

On a shutdown, the RAID controller has two options for external drives: it can spin them down, or it can leave them spinning. All RAID controllers start drive checks with a READY_CMD check. That means if a drive isn't spun down, it's immediately available; if it is spun down, the controller spins it up. They can handle that however they want. The important part is that they perform a cache flush regardless. Nobody's going to agree on how to handle it beyond that.
 

While looking for a new power saving server RAID controller, I found and tested the following:

The Adaptec controller does not support Windows power management, but does support idling and spinning down drives, see:
http://www.adaptec.com/en-us/_common/greenpower/?hpBan=greenPower-Text-US

The 3Ware controller supports ATA passthrough (or whatever it is really called) and Windows power management, and Windows will spin down the drives at idle.

The Areca controller does not support Windows power management, but does support spinning down idle drives.

P.
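
For reference, since the 3Ware passes ATA commands through, a spin-down can in principle be requested from user space with STANDBY IMMEDIATE (opcode 0xE0) via IOCTL_ATA_PASS_THROUGH. This is only a rough, untested sketch - the drive number is just an example, and it will only do anything on controllers that really do pass the command through to the disk:
Code:
/* Rough sketch: ask \\.\PhysicalDrive1 to spin down with ATA STANDBY IMMEDIATE
 * (0xE0) via IOCTL_ATA_PASS_THROUGH. Only meaningful on controllers that pass
 * ATA commands through; the drive number is purely an example. */
#include <windows.h>
#include <winioctl.h>
#include <ntddscsi.h>
#include <stdio.h>

int main(void)
{
    HANDLE h;
    ATA_PASS_THROUGH_EX apt;
    DWORD bytes = 0;

    h = CreateFileA("\\\\.\\PhysicalDrive1", GENERIC_READ | GENERIC_WRITE,
                    FILE_SHARE_READ | FILE_SHARE_WRITE, NULL,
                    OPEN_EXISTING, 0, NULL);
    if (h == INVALID_HANDLE_VALUE) {
        printf("open failed: %lu\n", GetLastError());
        return 1;
    }

    ZeroMemory(&apt, sizeof(apt));
    apt.Length = (USHORT)sizeof(apt);
    apt.AtaFlags = ATA_FLAGS_DRDY_REQUIRED;   /* non-data command  */
    apt.TimeOutValue = 10;                    /* seconds           */
    apt.CurrentTaskFile[6] = 0xE0;            /* STANDBY IMMEDIATE */

    if (!DeviceIoControl(h, IOCTL_ATA_PASS_THROUGH, &apt, sizeof(apt),
                         &apt, sizeof(apt), &bytes, NULL))
        printf("pass-through failed: %lu\n", GetLastError());
    else
        printf("spin-down requested\n");

    CloseHandle(h);
    return 0;
}
Whether the controller actually honors it, or just spins the disk straight back up, is another matter.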
 
The problem therein is determining what constitutes an "idle" drive.

In a RAID5, no drive is ever idle. Data is striped across all drives, so all drives are always active for any single read, of necessity. RAID1 and RAID0 are the same. The only drive that can ever be truly idle is the hot spare. Otherwise, the whole array is either idle or not. You can't pick and choose drives without breaking the array and introducing problems.
 
I received a MegaRAID SAS 8888ELP today.

S3 does not work at all: on going to S3 sleep, the machine enters a perpetual sleep-wake-sleep-wake cycle.
The cycle can only be interrupted by pulling the power.

So much for hoping the LSI would work :(
 
I am seriously surprised your LSI can't handle S3.

You might try a PERC 6/i. Mine works fine with S3. Drives spin down, and it wakes up perfectly.

I am in the middle of a rebuild (3x WD640 RAID 5, 1 hot spare) and I forced S3 to test.

Went to sleep and woke up just fine and even resumed the rebuild where it left off.

I am running the most recent LSI firmware (the same firmware used in your 8888ELP).
 
Taking any device from S1/S3 to D0 is NOT an easy task. Things do NOT just magically work. Registers must be cleared and reset, FIFOs reset, state redetermined and reestablished. We do not just wave some magic wand in the driver and say "poof, you're back at D0." That's NOT how it works.
(Let me clarify; OH GODS, do I WISH it worked that way. But really, it doesn't. And disk I/O is easily one of the worst areas there. The ICH10R actually lies about cache flushing because it gets lied to by the SATA drives. If the drives don't spin down appropriately, you will lose data.)

Clearing and resetting registers? Resetting FIFOs? Establishing states? If RAID card developers are struggling with these simple tasks, we've got bigger problems. Basically, this is some hand-waving to explain a very complicated problem.

In reality, coming out of S1/S3 is complicated, but I can't think of any reason why it would be impossible. More likely is the fact that 99.9% of the target audience for these cards is going to leave them on 100% of the time. There's no use dedicating development/test/validation cycles for every firmware revision to keep up with the intricacies of the ACPI sleep states across different implementations.

Good to know that the Perc 6/i can do S3, though. Thanks for the info, okashira.
 

He wasn't saying they can't do it or are having trouble with it.
He was saying they are not designing them to do it. 99% of the time, boxes that use HW RAID don't go to sleep or even get turned off, so why would the developers waste time writing FW or drivers to handle it?
 

Which is exactly what I said (and you quoted) too.


We've got one report of the Perc 6/i working. Has anyone had any luck with the Perc 5/i?
 
Good to know that the Perc 6/i can do S3, though. Thanks for the info, okashira.

Yeah, it's odd that the 6i is able to, but the 8888ELP can't. They are both based off the same SAS1078 controller and run the same firmware stack.

I was personally and extensively involved with the validation of the power save features of the SAS1078 controller and there is no reason that an HBA based on that controller couldn't do proper transitions in and out of the S1/S3 (and corresponding D1/D3) power states.

The firmware on the controller handles a transition into either of those low power states in a very similar way to how it handles a regular power down -- there really isn't a huge difference between the two. So the write cache will be properly flushed and the state of the disk arrays will be maintained. Coming out of D1 or D3 is basically treated the same as a fundamental reset on PCIe, so everything will be restarted as if the system were reset. That, of course, relies on the OS and driver to properly restore the state of the Base Address Registers and any necessary messaging queues.
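
Just to make the "restore the Base Address Registers" part concrete: after something that looks like a fundamental reset, the BARs (config offsets 0x10-0x24) and the command register come back at their defaults, so the OS/driver has to rewrite whatever was there before. A bare-bones illustration, using Linux-style config accessors - real drivers normally just call pci_save_state()/pci_restore_state(), which cover this and more:
Code:
/* Illustration only: what "restoring the BARs" means at the config-space level. */
#include <linux/pci.h>

#define HBA_NUM_BARS 6

struct hba_saved_cfg {
	u32 bar[HBA_NUM_BARS];
	u16 command;
};

static void hba_save_cfg(struct pci_dev *pdev, struct hba_saved_cfg *s)
{
	int i;
	for (i = 0; i < HBA_NUM_BARS; i++)
		pci_read_config_dword(pdev, PCI_BASE_ADDRESS_0 + 4 * i, &s->bar[i]);
	pci_read_config_word(pdev, PCI_COMMAND, &s->command);
}

static void hba_restore_cfg(struct pci_dev *pdev, const struct hba_saved_cfg *s)
{
	int i;
	for (i = 0; i < HBA_NUM_BARS; i++)
		pci_write_config_dword(pdev, PCI_BASE_ADDRESS_0 + 4 * i, s->bar[i]);
	/* memory decode / bus mastering bits go back last */
	pci_write_config_word(pdev, PCI_COMMAND, s->command);
}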

One thing that we do come across on occasion is a weakness that was present in the PCIe 1.0a specification (which is the spec that the SAS1078 controller conforms to). This has to do with how PCIe link power management state transitions are affected by PCIe flow control. This was clarified and patched up in the 1.1 and following specifications, but there may still remain cases where a 1.0a device is talking to a 1.1 or 2.0 compliant device and the link is unable to correctly transition due to the differing interpretations of the flow control and power management interactions from the 1.0a spec.
 
...the drives in the external enclosures keep on spinning...

Regarding why some people could have success while others fail, I think that "external enclosures" is the key here. Every external enclosure that I have ever worked with recommends that the external enclosure be started before the OS. Even if the RAID controller told the external enclosure to power down certain disks, the enclosure / expander / backplane might ignore it.

Do you think it would be reasonable to expect the controller to spin down the drives when powering down, and to pursue this with the vendors?
Sure, for software RAID. But not for hardware RAID disks. I wouldn't be comfortable with a RAID array spinning down... too scary.
A RAID set might have two disks, or it might have 16 disks. Many RAID controllers are a "system on a chip", and there really is an OS running on the chip in the controller. On top of that, the RAID controller may be set to spin up 2 disks at a time, with a delay of 2 seconds between each group of 2 disks. On some controllers, you can't configure how many disks spin up or the delay between the spin-up groups.

Now think of how long it takes for some cards to configure themselves during boot... If your system goes to sleep with drive X in use, it expects drive X to be there when it wakes up. But on wake, your RAID card needs to boot itself and spin up all of the disks in the array. This may take 30 seconds. Then consider that someone could set their system to sleep after two minutes... You've got the disks spinning down multiple times per day, and the OS might think that a disk is missing or failed if it isn't available at wake.

If the OS is on a raid set connected to the raid controller, then you have a real conundrum.

All the controllers support spinning down drives at idle, so I guess it is a form of MAID?
When I think of MAID, I think of single disks. It sounds like you want a MAID of RAID, and I don't know if that's possible or even wanted (see above). The LSI RAID controllers that I use do spin down idle disks, but they only consider a disk idle if it isn't in a RAID array.

A SAS/SATA HBA combined with software RAID might do this for you. A SAS/SATA HBA would just present the 6 disks in an enclosure as 6 directly connected disks.

...
On the higher end, the cards I tested were the Adaptec 5445, 3Ware 9690SA-4I, and Areca ARC-1680LP.
(I also have an LSI MegaRAID SAS 8888ELP on order.)...
It sounds like you are spending a lot of time and money to solve a minor problem. What could or should be possible is not always practical. If 9 out of 10 cards don't do what you want, but one does, I would err on the side of caution and not use the feature that every other card is avoiding.
 
Hello,

Well, I suppose it depends on your motherboard too. I was looking for a state-preserving RAID controller (S4, S3), but since my motherboard does not support S4, I ended up looking for an S3-capable RAID card.
Tried various LSI options - no way.
I almost came to the conclusion that there are no S3-capable RAID controllers. And then I bought my last option, the Adaptec 6805E. And S3 works fine now! (Intel S1200BTL + Adaptec 6805E). The card also supports various power options for spin-downs.
So maybe all Adaptec 6th edition controllers do?

Anyway, I suppose S3/S4 is needed when you run a workstation-type PC - there's no need for it on a server.

Good luck!
Vilius
 
I have an ARC-1880x hooked up to two HP SAS expanders in 15-disk enclosures, and the idle power down works just fine. My disks are set up to power down after 30 minutes of idle activity.

This probably does depend on your SAS expander (if in an enclosure that uses one) though. Also, the fans are still running in the enclosures, so it's not like power usage goes down all the way either.
 
Which is exactly what I said (and you quoted) too.


We've got one report of the Perc 6/i working. Has anyone had any luck with the Perc 5/i?

I was doing research into the cause of death of my Dell Perc 6i card and I found this thread.

I had this card in a media center PC connected to 3 drives running a RAID 5 array. It ran for a solid 2 weeks without hiccups, going into and out of suspend maybe 4 to 5 times a day. Then the system started to periodically freeze. I started to worry, so one day I decided I'd back up everything to an external drive. As soon as I fired up the backup, I got a BSOD. I restarted the system and got a "Disk boot failure......". Opened the case, no heartbeat on the controller.

At this point the controller no longer identified itself during POST. After maybe 25 restarts and pulling it in and out of the system, it started identifying itself again, and even then only sporadically. When it did, it would report "F/W is in fault state". Dell's white papers mention the solution as "Please contact Dell support". Somehow, after another 20 restarts, I was able to re-flash the firmware, and while it finished successfully, it still refused to work.

I got a new controller, imported the array and got my data off it ASAP. Funnily enough, afterwards I put the broken controller back in (after the system sat for 3 days) and Windows booted up. After it booted maybe 3 times, I decided I'd let it run and see what happens. I closed up the case and stuck it back into the stand. As soon as I fired it up, same problem as above (disk boot failure). I could not get it to come back, so I swapped back to the new card and the system has been running since last night.

Then I started searching for the cause of death and found this thread. In hindsight, upon the first bootup I should have checked the log.

Coincidentally, I had also glued a Zalman northbridge heatsink to the processor's heat spreader with Arctic Silver adhesive a few nights before the issues began manifesting, but I doubt that was the cause of death.

So my question is, could the constant use of suspend have killed the firmware?
 