Very odd SAS drive slowdown

ugly

I'm running a single SAS drive as its own filesystem within ZFS and am encountering a really strange problem.

Background:
I recently acquired a Seagate Cheetah ST3300557SS, a 300GB SAS 6Gb/s drive with encryption. It's connected to one channel of a BR10i (PCIe x8) along with 7 other SATA disks, on an X9SCM-F board. ZFS runs inside a VM on ESXi, and the drive's filesystem is shared over NFS.
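
Roughly how the drive is set up, for reference (the pool name here is a placeholder, not the actual one):

zpool create saspool c9t16d0     # single-disk pool on the SAS drive
zfs set sharenfs=on saspool      # share the filesystem over NFS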

Problem:
While running a benchmark write test (dd if=/dev/zero) to fill the entire drive with zeros, write performance drops from ~150MB/s all the way down to ~10MB/s at random intervals. I can't reproduce the same behavior every time, but the writes slow down when the drive is around 20GB full and around 50GB full.
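
The fill test was along these lines (the target path and block size here are illustrative, not the exact invocation):

dd if=/dev/zero of=/saspool/testfile bs=1024k    # sequential fill of the SAS-backed filesystem until it's full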

Even when transferring large files to the drive, the transfer speed drops to less than 10MB/s at random times.

In theory, the drive shouldn't have slowdowns of any kind, and I'm completely confused because I can't find the source of the problem. I've tried giving the VM anywhere from 2GB to 8GB of memory and varying the number of CPUs/cores assigned to it, with the same results.

Anyone see this kind of behavior before? How the heck do I solve this thing?
 
What version is the host OS, and do you see the same problem if you do the same thing from the host OS with the drive formatted as, for example, ext3 or ext4?
 
ESXi 5; the VM controlling the SAS drive/BR10i is Solaris 11, and the BR10i is passed through to Solaris 11 via VT-d. The setup was working perfectly fine, with no slowdowns, with four striped ST3300657SS-H disks (Dell's version of the Cheetah) plus 4 other SATA disks, in exactly the same configuration.
 
I have one coming in tomorrow, although it's a slightly different model (Seagate Cheetah ST3300657SS, without encryption). I'll test and report back.
 
Results from iostat -xnzM 5:


                    extended device statistics
    r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
   51.2    0.0    6.4    0.0  0.0  0.7    0.0   14.0   0   8 c9t3d0
   50.2    0.0    6.2    0.0  0.0  0.7    0.0   13.6   0   8 c9t4d0
   48.4    0.0    6.1    0.0  0.0  0.6    0.0   12.1   0   7 c9t5d0
   49.6    0.0    6.1    0.0  0.0  0.6    0.0   12.9   0   8 c9t6d0
    5.0   13.4    0.6    0.9  0.0  0.1    0.0    7.8   0   3 c9t13d0
    4.2   15.6    0.5    0.8  0.0  0.1    0.0    4.6   0   3 c9t14d0
    0.0  145.4    0.0   13.9  0.0  0.7    0.0    4.8   0  11 c9t15d0
    0.0  120.8    0.0   11.5  0.0  8.1    0.0   66.7   0  99 c9t16d0


c9t15d0 is ST3300657SS (SAS drive without encryption) and c9t16d0 is ST3300557SS (SAS drive with encryption). The two drives are striped in ZFS.

Very odd numbers - as you can see, c9t16d0 is 99% busy with an average service time of 66.7ms while writing at only 11.5MB/s. By comparison, c9t15d0 is just 11% busy with an average service time of 4.8ms. Since they are essentially identical drives apart from the encryption in one of them, a striped (RAID-0) configuration should make both behave exactly the same way. Does encryption really affect a drive's performance this much?
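
One way to narrow it down is to watch per-vdev activity from the pool side while a copy runs, so the slow disk stands out against its stripe partner (the pool name is a placeholder):

zpool iostat -v saspool 5    # per-vdev read/write operations and bandwidth, refreshed every 5 seconds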
 
I've seen disks do that before failing. What makes you think this drive is in good health?
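
It would be worth pulling the error counters off the drive before trusting it. On Solaris, something like the following (the smartctl line assumes smartmontools is installed, and the device path is a guess that may need a slice suffix):

iostat -En c9t16d0                        # soft/hard/transport error counts as seen by the OS
smartctl -a -d scsi /dev/rdsk/c9t16d0     # SCSI error counter log and grown defect list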
 