OpenSolaris-derived ZFS NAS/SAN (OmniOS, OpenIndiana, Solaris and napp-it)

And you, Hardforumers, what do you think about this trick?

It's not a trick, it's a way to improve performance at the risk of data loss on power loss.
In the extreme case, you may lose about the last 5 seconds of data that had already been confirmed as on disk before the power loss.

If there were no writes in the last 5 seconds, you have no data loss. If a VM was in the middle of a transaction, it may become corrupted.
If you have dependent financial transactions, your data may be inconsistent.

If you want it cheap with maximum performance and not too much danger, disable sync and add a UPS.


The problem:
ZFS usually collects small async random writes in RAM and flushes them in one large sequential write after a few seconds.
This can be more than 10x faster than small sync writes without a dedicated log device.

A slow log device can help, but only a little. You need very low latency and the best write performance to achieve values similar to sync=disabled. I have not yet seen or heard of an SSD that can really improve things compared to sync=disabled or a RAM-based log device. That's a pity, because you only need a few GB. You may look at a cheaper alternative to a Zeus or DDRdrive disk, like the Acard RAM-based disks.
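
For anyone who wants to test the tradeoff themselves: sync is a per-dataset property, so the risk can be limited to the datasets that need the speed. A minimal sketch (pool, dataset and device names are placeholders):

Code:
# check the current setting
zfs get sync tank/vmstore

# accept the data-loss window only on this dataset
zfs set sync=disabled tank/vmstore

# or keep sync semantics and add a dedicated low-latency log device instead
zpool add tank log c4t2d0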
 
You may look at a cheaper alternative to a Zeus or DDRdrive disk, like the Acard RAM-based disks.
Zeus: hard to find in Europe.
... ZeusIOPS® SSD SLC / MLC for ZIL are quite similar to the Intel 520
... ZeusRAM™ SSD is very expensive (2,488 € ex VAT = $2,969) :eek:

The Intel 520 can do random write IOPS (4k blocks) of up to 80,000 @ 500 MB/s.
Sounds great for a ZIL, with a 5-year warranty, but it's MLC and has no supercap :rolleyes:

Would a RAID 10 of Intel 520s be OK?

Cheers.

St3F
 
Intel 320 = MLC ... for ZIL?!?
Max Read: 270 MB/s
Max Write: 130 MB/s
Max Read IOPS: 38,000 (random 4k)
Max Write IOPS: 14,000 (random 4k)
... not enough! :/
I don't want to use ESXi, btw.

What about these candidates for the ZIL?
- OCZ Vertex 3 Max IOPS SATA III 120 GB, which does random IOPS (4k block) up to 85,000 and R/W up to 500 MB/s (~158 € ex VAT)? ... they are MLC, but if I buy 4, would a RAID 10 be OK?
- OCZ RevoDrive 3 X2 Max IOPS PCI-Express SSD, which does random IOPS (4k block) up to 220,000 @ R/W 1,500 MB/s (~668 € ex VAT)? ... they are MLC, but if I buy 2, would a RAID 1 be OK?
- OCZ Deneva 2 C Sync 60 GB, which does random IOPS (4k block) up to 65,000 @ R/W 500 MB/s (~153 € ex VAT)? ... they are MLC, but if I buy 4, would a RAID 10 be OK?
- Plextor M3 Pro 128 GB, which does random IOPS (4k block) up to 75,000 @ R/W 500 MB/s (~145 € ex VAT)? ... they are MLC, but if I buy 4, would a RAID 10 be OK?

Cheers.

St3F

I've tested a number of SSDs now. The listed specs are not useful. The specs they give you are for high-queue-depth asynchronous writes, the exact opposite of what the ZIL does (single or low-queue-depth sync writes).

The best cheap, protected SSD I've used so far is the Intel 320 300 GB, using only a 15 GB slice as ZIL (and the rest left empty for garbage collection). I can get about 8,000 write IOPS at 4k blocks across NFS from ESX VMs - pretty damn good for a very cheap storage server that actually respects O_SYNC writes. With larger blocks it can do 80-120 MB/sec over NFS. By over-provisioning so heavily, the drive's wear indicator suggests it will wear out at about 4.5 PB written, and I'm getting near-zero write amplification - I write a lot (over NFS) to the array, but I won't wear out the media for years.

Compared to a Deneva 2 30 GB SLC (no supercap - a ZFS vendor I was considering is using these :eek: ) that I'm testing right now - a drive claiming a stellar 80,000 write IOPS / 550 MB/s - I get slightly worse performance than with the Intel 320 (5-6,000 4k IOPS to VMware) at a higher cost.

Of course a RAM-based drive is going to be faster, but if you're looking for good responsiveness on VMs at a reasonably low cost, I've had success with the Intel SSDs.
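
For reference, the slice approach described above looks roughly like this on OI/Solaris. A sketch only, with example device names (the small slice is created first with format -> partition):

Code:
# attach the ~15 GB slice as a dedicated log device
zpool add tank log c5t4d0s0

# or, if you worry about the SLOG device itself failing, mirror two of them
zpool add tank log mirror c5t4d0s0 c5t5d0s0

# verify
zpool status tank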
 
Have you ever tried the OCZ Synapse Cache SATA III 2.5" SSD, which is designed for caching?
*NOTE: the Synapse Series features 50% NAND flash over-provisioning to accommodate performance and software features = 128 GB model: 64 GB available / 64 GB model: 32 GB available

Is there a list of SSDs or controllers with supercaps?
... I found: Intel G3, Marvell C400 and SandForce SF2000 have supercaps; are there other controllers with this capability?

Cheers.

St3F
 
I just set all of the ZFS folders to not use dedup, but it still looks like it's enabled. Is there something I have to do after turning it off to fully rewrite all of the data across the drives and get rid of the dedup table?

There's no way to get rid of the dedupe data/table other than copying the data away, wiping and rebuilding, then copying back. Dedupe runs on the fly *only*, so it cannot un-dedupe data that you are not moving around. Unfortunately I have the same problem :) However, 1.75x was quite good; I wasn't suggesting you turn it off in any case. Mine was 1.01x, which is pretty much useless.
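
A rough sketch of the rewrite route via zfs send/recv (dataset names are placeholders, and it assumes dedup=off is inherited on the target so the received copy is written without dedup):

Code:
zfs set dedup=off tank/vms                        # new writes are no longer deduped
zfs snapshot tank/vms@migrate
zfs send tank/vms@migrate | zfs recv tank/vms2    # blocks are rewritten undeduped
zfs destroy -r tank/vms                           # DDT entries released here
zpool list tank                                   # DEDUP column should drift back towards 1.00x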
 
Have you ever tried the OCZ Synapse Cache SATA III 2.5" SSD, which is designed for caching?
*NOTE: the Synapse Series features 50% NAND flash over-provisioning to accommodate performance and software features = 128 GB model: 64 GB available / 64 GB model: 32 GB available

Is there a list of SSDs or controllers with supercaps?
... I found: Intel G3, Marvell C400 and SandForce SF2000 have supercaps; are there other controllers with this capability?

Cheers.

St3F

No I haven't used Synapse. This might be a decent place to get some info about various ZIL drives (not all his drives have supercaps) --

http://forums.freebsd.org/showthread.php?t=23566
 

Asking "myself by me" :) ... the same questions : Faster, Higher, Stronger : powerful storage .
My use is Huge HTPC with 6x HD Stream recording and 10x stream reading, simultaneously at the same time.
... and I begin a list of SSD with supercap in the market on 2012, if you want to follow my thread.

Today I ask to resellers for quotes about STEC Zeuss product and Micron RealSSD P300

Cheers.

St3F
 
6x HD stream recording is not intensive at all, unless it's a much higher bitrate than a normal HD stream. Is this 6x HD RAW streams?

Normal HD streams cap at 19.5 Mbps, and 6x that would be a total write throughput of about 15 MB/s.
 
6x HD stream recording is not intensive at all, unless it's a much higher bitrate than a normal HD stream. Is this 6x HD RAW streams?
The system must handle 2x HD RAW streams ... 2400 Mb/s (296.25 MB/s)
or
6x HD stream recording simultaneously @ 120 Mb/s each (so 90 MB/s), plus at the same time 10x stream reading @ 120 Mb/s each (150 MB/s).

These incompressible files are between 20 MB and 360 MB each.

I'm starting to think ZFS may not be the right way to share big data through NFS!
... or NFS is not the right protocol to use with ZFS when you need performance with big files.
(Where am I wrong?)

I'm reconsidering the use of NFS!
... Should I go with CIFS / SMB?!

From: http://www.hob-techtalk.com/2009/03/09/nfs-vs-cifs-aka-smb
As our tests are now finished, we can say the winner is NFS: If you are reading from a server share, the results are slightly better than for CIFS. When writing to a server share, CIFS is clearly faster for all writing benchmarks.

If you are interested in the results with charts, please download this pdf

On the other hand, Andrew Galloway wrote "The Case Of The Mysterious ZIL Performance" on 03/01/2012
=> http://nex7.com/node/12

We were testing with a single write stream, from a single dd process (dd if=/dev/zero of=/volumes/test-pool/test-folder bs=8K count=150000, with sync=always and compress=off). It turns out that what we perceived to be honestly pretty terrible ZIL IOPS potential wasn't.. entirely accurate. You see, this logic that does a 'for' loop through each log vdev, writing and syncing and then going on to the next, will in fact quiesce any and all writes that have come in between the time it began its last commit to disk and its next one, into that next one. So you see, you can only do 3000 IOPS on this particular device, but with a single-stream dd it is 3000 8K IOPS. If, instead, you do say 16 threads of dd at 8K, it is still doing 3000 IOPS, but they're 128 KB per IOP! This 'bucket' approach is what we just didn't realize happened at first, by way of our poor initial testing methodology of a single dd. A single dd writing sync to something is waiting for that acknowledge back for each 8K IOP, so it can never put more than one 8K block into each IOP alone.. you need to run multiple threads simultaneously to achieve that.

Now to be clear, 16 threads of dd at 8K is actually over 300 MB/s, which means it can't actually sustain that either - you've more than maxed out the 3G SAS limit (which this drive is) by then - but the point is there. The mystery is solved. The ZFS ZIL can and will utilize multiple top-level vdevs defined as log devices, but it will do so in a round-robin fashion, and it will wait until it has completed a transaction against one top-level vdev before talking to the next (and while waiting, will queue up to 128K of data to write in that next operation). If the vdevs are not very low-latency, that latency will become their Achilles' heel, and seriously impact the ZIL (and thus also the latency of writes from all your clients). This is why utilizing devices like a STEC ZeusRAM or ZeusIOPS is so critical. Their average write latency is significantly lower than most other disks, including other SSDs (especially in the case of the ZeusRAM, since it is in actuality RAM), in part due to their great design and also because they effectively instantly answer (ignore) the sync() request ZFS does, as they have their own battery backup and can safely do that, further reducing the write latency ZFS perceives.

In summary -- this is all yet further proof that latency can be a serious killer, and that investing in some good, very low-latency log devices for workloads that have a lot of ZIL traffic is absolutely critical to achieving success with ZFS, instead of introducing a bottleneck into your pool configuration. It is very fair to say that even if your chosen log device has reportedly extremely high IOPS, we'll never notice it with how we write to it (send down, cache flush, and only upon completing send down more -- as opposed to a write-cache-utilizing sequential write workload) if it does not ALSO have a very, very low average write latency (we're talking on the low end of microseconds here).
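
The single-stream vs. multi-stream effect described above is easy to reproduce. A hedged sketch along the lines of the article's test (pool name, paths and counts are arbitrary):

Code:
# test dataset with forced sync writes, as in the article
zfs create -o sync=always -o compression=off tank/ziltest

# single stream: every 8K write waits for its own ZIL commit
dd if=/dev/zero of=/tank/ziltest/one bs=8k count=150000

# 16 parallel streams: writes arriving during a commit get batched
# into the next, much larger, log write
for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16; do
  dd if=/dev/zero of=/tank/ziltest/f$i bs=8k count=150000 &
done
wait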

Now, I'm lost ...

Cheers.

St3F
 
It doesn't say anything about how they optimized the NFS stack on OpenSolaris. I found, at least for me, that it falls flat on its face when left at the defaults and you start going >300 MB/sec.

The other question is, they don't say what they did for the NFS share, so I assume they went with sync mode on OpenSolaris.

It sounds like, in your application, I wouldn't be concerned about whether you lose 5 seconds of data or not if something causes it to go down. I would probably just write the video streams via NFS with sync=disabled, at least for that one location.

With sync=disabled, the need for a ZIL for that section of the workload also goes away.

I could be wrong, and you might have reasons you need that consistency down to sub-second levels, though.
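
If it helps, sync is a per-dataset property, so the recording share can run with sync=disabled while everything else keeps the default. A minimal sketch with placeholder names:

Code:
zfs create tank/recordings
zfs set sync=disabled tank/recordings    # only this dataset skips sync semantics
zfs get sync tank tank/recordings        # tank itself stays at "standard"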
 
It doesn't say anything about how they optimized the NFS stack on OpenSolaris. I found, at least for me, that it falls flat on its face when left at the defaults and you start going >300 MB/sec.
What are your NFS optimizations?

I would probably just write the video streams via NFS with sync=disabled, at least for that one location.
You're saying it is possible to set sync=disabled ... just for one folder (the one I read 3 big streams from), and leave the others at the NFS default?

Cheers.

St3F
 
There's no way to get rid of the dedupe data/table other than copying the data away, wiping and rebuilding, then copying back. Dedupe runs on the fly *only*, so it cannot un-dedupe data that you are not moving around. Unfortunately I have the same problem :) However, 1.75x was quite good; I wasn't suggesting you turn it off in any case. Mine was 1.01x, which is pretty much useless.

I'm more worried about the performance vs. space at this time. I moved the VMs to a new ZFS folder which didn't have dedup on and got rid of most of the dedup table. It looks like there is still a little bit left, since it shows 1.06x. The performance is actually worse now, though. I was getting about 50 IOPS before and now I can only seem to get 30 max. I set up a new iSCSI volume as well and ran the same IOMeter test. The latest results were:

6.84 IOPS
78 ms avg read
192 ms avg write

It looks like I'm going in the wrong direction...

I have dual-homed both ESXi boxes to a Cisco 3750 switch, so I shouldn't be having any issue on the network side.
 
I am having a slight annoyance with the xampp script. If I start the services and my server gets rebooted, I have to manually turn them on again.
 
I'm more worried about the performance vs. space at this time. I moved the VMs to a new ZFS folder which didn't have dedup on and got rid of most of the dedup table. It looks like there is still a little bit left, since it shows 1.06x. The performance is actually worse now, though. I was getting about 50 IOPS before and now I can only seem to get 30 max. I set up a new iSCSI volume as well and ran the same IOMeter test. The latest results were:

6.84 IOPS
78 ms avg read
192 ms avg write

It looks like I'm going in the wrong direction...

That is really odd to me, and not very logical to my understanding: how can a measure that relieves the system of additional work slow down performance?

I understand you created another ZFS folder on the same pool, is that correct? Have you compared the ZFS folder settings other than dedupe, so you're sure everything else is the same? Was your 50-IOPS result on NFS and your 6.x-IOPS result on iSCSI? Don't mix iSCSI and NFS results, they will not be the same. Would you have enough disk space to create a new pool?
I don't know why it would show dedupe=1.06x when dedupe is turned off... shouldn't that be 1.00x in that case? Or do you mean it's showing 1.06x on your old ZFS folder, where you turned it off?
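
One quick way to see where a residual ratio like that comes from is to check dedup per dataset rather than per pool; a sketch, with a placeholder pool name:

Code:
zpool list yourpool          # pool-wide dedup ratio
zfs get -r dedup yourpool    # shows any dataset that still has dedup=on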

I have dual-homed both ESXi boxes to a Cisco 3750 switch, so I shouldn't be having any issue on the network side.

What do you mean by "dual-homed"... did you aggregate two NICs? How is performance when you disconnect one leg and just use one Gig interface?

Cheers,
Cap'
 
OI 151a (CLI only) user here.

Can I run virtualisation straight on top of OI? I basically need a Windows machine that will act as a media server, and don't want a separate box pointing to OI via iSCSI. I'm guessing I will need to carve off some space on the data pool for the VM guest(s), leaving the rest to be passed through to such guests.

This doesn't sound too promising for an AMD user: http://wiki.openindiana.org/oi/KVM
 
Is there a way to redistribute the free space in a zpool so that it's even across drives? From what I understand, having a setup like below can be a performance bottleneck:
Code:
root@solaris:~# zpool iostat -v
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
pool1       2.70T  2.73T     17      0  11.6M  87.4K
  c5t3d0    2.68T  44.2G     17      0  11.6M     17
  c5t1d0    27.6G  2.69T      0      0  32.7K  87.4K
----------  -----  -----  -----  -----  -----  -----
rpool       7.72G  7.03G      0      0  1.32K    467
  c3t0d0s0  7.72G  7.03G      0      0  1.32K    467
----------  -----  -----  -----  -----  -----  -----
 
TJ, I assume this is a raid0 where you added the 2nd drive? I can't imagine any other config, just checking. Anyway, the only good way is to copy files and delete the sources. Or if you have space, zfs send/recv to a different pool, recreate the pool and copy back.
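
A rough sketch of the send/recv route, assuming a second pool with enough free space (pool and snapshot names are placeholders):

Code:
zfs snapshot -r pool1@move
zfs send -R pool1@move | zfs recv -F pool2/pool1copy
# destroy and recreate pool1 with both disks, then send the data back;
# new writes will then be spread evenly across the two drives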
 
TJ, I assume this is a raid0 where you added the 2nd drive? I can't imagine any other config, just checking. Anyway, the only good way is to copy files and delete the sources. Or if you have space, zfs send/recv to a different pool, recreate the pool and copy back.
Not exactly raid0. I just ran out of space on my 1-drive pool, so I added a second drive. I was afraid that was going to be the answer.
 
Sorry, sloppy wording on my part. That is basically ZFS's version of raid0 - sorry for the confusion.
 
Cap,

Sorry for the sloppy wording. I have 2 ESXi 5 U1 hosts which have 2 Intel NICs each. I have created a port channel for each of these hosts connecting to the 3750 switch. The NAS has a single 1 Gb Realtek 8111 NIC. I have been through a lot of troubleshooting commands and I'm still failing to see where the issue lies. It doesn't appear to be the CPU or the network. I don't see a lot of waits on the disk system.

Using esxtop I can see 60 ms (minimum) latency on the guest GAVG/sec numbers, which should be the sum of DAVG and KAVG, but neither of those two columns has populated values to indicate whether the total latency is coming from the storage device or the vmkernel.

I figured out that the dedup value of 1.06 was due to dedup accidentally getting turned on for a separate zfs folder which is a target for Apple TimeMachine backups. I can't easily blow that one away yet.

Here is the output from zpool list:
Code:
NAME     SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
rpool    149G  8.33G   141G   5%  1.00x  ONLINE  -
tank01  3.62T  2.39T  1.24T  65%  1.06x  ONLINE  -

I have 1.24T free, but I'm not sure if I can allocate that to another pool if it's already been allocated to tank01. Thanks for the continued assistance in tracking this one down.
 
I am having a slight annoyance with the xampp script. If I start the services and my server gets rebooted, I have to manually turn them on again.

You may add an init script in /etc/init.d
(like the napp-it init script there)
 
I will probably just end up creating a service manifest and importing it as a service using the following command.

Code:
svccfg import manifest.xml
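
If you go the SMF route, the enable/check steps would look roughly like this (the manifest path and service name are placeholders for whatever you write for xampp):

Code:
svccfg import /path/to/xampp.xml   # register the service with SMF
svcadm enable xampp                # start it now and on every boot
svcs -xv xampp                     # check state / diagnose failures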
 
Hi there, getting ready for my first ZFS build.

I have 4x 2TB Samsung drives and 4x 2TB WD Green drives.

I was wondering if I should do two raidz1 or put all 8 drives in one array and do raidz2.

What are the pros and cons?

Thanks
 
Two raidz1 pools give you potentially better IOPS (though that would rely on IO being split across the pools).

Single-client bandwidth would be better on a single raidz2, though it would probably even out with multiple clients (again assuming IO is split across the raidz1 pools).

Resilience is better on the raidz2 - it can survive any two-disk failure, whereas the 2x raidz1 pools can only survive a double disk failure if it's one disk from each pool that fails.

This is all assuming similar performance from the two drive types, and no shenanigans from the WD drives around being used in a RAID setup.


Personally, all else being equal, I'd go for a single raidz2.
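
For completeness, the two layouts being compared would be created along these lines (disk names are just examples; the two raidz1 vdevs could also live in separate pools, as discussed above):

Code:
# one pool, two 4-disk raidz1 vdevs (ZFS stripes across both)
zpool create tank raidz1 c2t0d0 c2t1d0 c2t2d0 c2t3d0 raidz1 c2t4d0 c2t5d0 c2t6d0 c2t7d0

# one pool, a single 8-disk raidz2 vdev
zpool create tank raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0 c2t7d0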
 
Hi there, getting ready for my first ZFS build.

I have 4x 2TB Samsung drives and 4x 2TB WD Green drives.

I was wondering if I should do two raidz1 or put all 8 drives in one array and do raidz2.

What are the pros and cons?

Thanks

If you're like me (a stickler for best practices), Sun would recommend a 6-drive raid-z2 with 2 hot spares:

raid-z1 = 3 drives (2+1)
raid-z2 = 6 drives (4+2)
raid-z3 = 9 drives (6+3)

But 2 hot spares for such a small pool would be a waste in my opinion, so I would say get another disk or two, if you can, and go z3 with or without a hot spare.
 
Ideally, for maximum performance, a raidz2 should have 6 or 10 drives (any more and you should probably be considering multiple vdevs in your pool).

That said, 8 drives will work fine - it's just that with raidz, bandwidth doesn't scale in a linear fashion. So an 8-drive raidz2 won't be 33% faster than a 6-drive raidz2 (as you'd expect with raid0) - it'll probably be a little quicker, but not that much.

FYI, raidz1 and raidz3 are similar, but for raidz1 the sweet spots are 3, 5 or 9 drives, and for raidz3 it'd be 7 or 11 drives!
 
A very valuable Solaris feature

These are snapshots (boot environments, BE). They are automatically created
on some updates so you can go back in case of problems.

Some updates activate a new BE, others keep the old one and you must activate
or select manually during boot.

Available CLI commands (see the example below):

list BEs: beadm list
create a BE: beadm create name
delete a BE: beadm destroy name
activate a BE: beadm activate name
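
For example, to capture the current, working state as a new BE and make sure the next boot uses it (the BE name is arbitrary):

Code:
beadm list                   # active now (N) and active on reboot (R)
beadm create pre-update      # snapshot the current, working state
beadm activate pre-update    # make it the default for the next boot
init 6                       # reboot into it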

Hey gea,

Thanks for your post.

Am I right in assuming, then, that the new boot environment was created due to an update? I didn't update anything manually myself. It seriously did 'just happen' on one reboot.

How come my napp-it install is missing from the newly activated BE?
 
If you're like me (a stickler for best practices), Sun would recommend a 6-drive raid-z2 with 2 hot spares:

raid-z1 = 3 drives (2+1)
raid-z2 = 6 drives (4+2)
raid-z3 = 9 drives (6+3)

But 2 hot spares for such a small pool would be a waste in my opinion, so I would say get another disk or two, if you can, and go z3 with or without a hot spare.

thanks =) Don't know if I have enough storage on my main rig to hold all the data on the 8 drives atm when I make the array -.-
 
idmap vs guestok.

My shares are currently set to guestok=true (solely Windows clients), but I've been thinking of removing this.

I played with idmap ages ago (when I first built my ZFS NAS 2 years ago!). So I need help getting it to work.

Basically my Windows clients have a single login account (Workgroup mode), so say the user is Joe Bloggs (username = Joe), how would I get this user mapped to OI and enable access to the shares?

Some Windows clients won't have passwords on the user account (for example my HTPC). Others will (laptops).
 
Just so I'm not leaving you in the dark... I followed the troubleshooting steps on this page: http://constantin.glez.de/blog/2011/01/my-favorite-oracle-solaris-performance-analysis-commands
and got the following results:

http://pastebin.com/k7jRc0En

Thanks again.

Hey Boom,
Not that I don't want to help, but this is clearly beyond my expertise, I'm sorry. I hope one of the experts here can help further... With what I told you before, my pool(s) are working pretty well now, though I am just looking at the obvious and not digging into numbers like IOPS etc., so if I can copy reasonably fast, I am happy with it :)
Cheers,
Cap
 
Hey gea,

Thanks for your post.

Am I right in assuming, then, that the new boot environment was created due to an update? I didn't update anything manually myself. It seriously did 'just happen' on one reboot.

How come my napp-it install is missing from the newly activated BE?

BEs are created either manually or during updates.
Usually the new BE is activated automatically, but sometimes they are only created so you can optionally go back.

If you are unsure about the right BE and your current state is OK,
create a new BE, activate it and reboot.
 
idmap vs guestok.

My shares are currently set to guestok=true (solely Windows clients), but I've been thinking of removing this.

I played with idmap ages ago (when I first built my ZFS NAS 2 years ago!). So I need help getting it to work.

Basically my Windows clients have a single login account (Workgroup mode), so say the user is Joe Bloggs (username = Joe), how would I get this user mapped to OI and enable access to the shares?

Some Windows clients won't have passwords on the user account (for example my HTPC). Others will (laptops).

Idmap is only usable with a Windows AD domain, to map AD users/groups to Unix users/groups.
In a workgroup, a Windows SMB user on OI is the same as the Unix user on OI,
so idmapping users results in nothing but problems.

If a Windows user Joe needs access to OI via SMB, you need to create a user Joe on OI with the same password as on Windows.
You also need to set folder ACLs and optionally share ACLs to allow Joe (or everyone@) access to the shared folder.
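
Roughly, with placeholder pool/share names, and assuming SMB password support (pam_smb_passwd in /etc/pam.conf) is already enabled before the password is set, the workgroup setup looks like this:

Code:
useradd joe                                  # same name as the Windows account
passwd joe                                   # same password as on Windows

zfs set sharesmb=name=media tank/media       # share the dataset over SMB
/usr/bin/chmod -R A=everyone@:full_set:fd:allow /tank/media   # or grant only user joe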
 
Guys,
</n00bmode on>
I'm in the middle of my build (http://hardforum.com/showthread.php?t=1702172)
I want to re-use some of my drives from my old ESXi machine and give them to my Nexenta VM (which has the M1015 in passthrough mode).
These are Samsung HD103SJ (1 TB) drives and Seagate ST2000DM01 (2 TB) drives.
The drives are formatted with VMFS; do I have to (pre)format them before I assign them to Nexenta, or will Nexenta format them for me when I create a new mirror, for example?

</n00bmode off>
Thanks..
 
Withdrawn, same thing happens when running the command on a new WD20EARX. Sorry. Can't find the embarrassed smiley. :)

Hi guys, just got 5 new HD204UI/VP4s. (Hint, they are inside of these).

Question: the mfgr. date appears to be 2011.10 (Oct 2011?) on the labels. From what I've been able to gather so far, drives manufactured after Dec 2010 already contain the firmware patch, although the firmware version stays the same. Anyway, in preparing to put these into a new pool I did the following:

Code:
root@WKMMG3:~# uname -a
SunOS WKMMG3 5.11 oi_151a5 i86pc i386 i86pc Solaris

root@WKMMG3:~# iostat -en | grep c4t0
    0   0   0   0 c4t0d0

root@WKMMG3:~# echo | format | grep c4t0
       2. c4t0d0 <ATA-SAMSUNG HD204UI-0001-1.82TB>

root@WKMMG3:~# smartctl -d sat,12 -a /dev/rdsk/c4t0d0
smartctl 5.42 2011-10-20 r3458 [i386-pc-solaris2.11] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     SAMSUNG SpinPoint F4 EG (AFT)
Device Model:     SAMSUNG HD204UI
Serial Number:    S2K4J1CBA20247
LU WWN Device Id: 5 0024e9 005169934
Firmware Version: 1AQ10001
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 6
Local Time is:    Fri Jul 13 15:29:35 2012 MDT

==> WARNING: Using smartmontools or hdparm with this
drive may result in data loss due to a firmware bug.
****** THIS DRIVE MAY OR MAY NOT BE AFFECTED! ******
Buggy and fixed firmware report same version number!
See the following web pages for details:
http://www.samsung.com/global/business/hdd/faqView.do?b2b_bbs_msg_id=386
http://sourceforge.net/apps/trac/smartmontools/wiki/SamsungF4EGBadBlocks

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (20580) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
SCT capabilities:              (0x003f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0026   252   252   000    Old_age   Always       -       0
  3 Spin_Up_Time            0x0023   067   067   025    Pre-fail  Always       -       10107
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       10
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   252   252   051    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0024   252   252   015    Old_age   Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       6
 10 Spin_Retry_Count        0x0032   252   252   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   252   252   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       6
181 Program_Fail_Cnt_Total  0x0022   100   100   000    Old_age   Always       -       154
191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       3
192 Power-Off_Retract_Count 0x0022   252   252   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0002   064   061   000    Old_age   Always       -       26 (Min/Max 18/39)
195 Hardware_ECC_Recovered  0x003a   100   100   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   252   252   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0036   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       30
223 Load_Retry_Count        0x0032   252   252   000    Old_age   Always       -       0
225 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       10

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


Note: selective self-test log revision number (0) not 1 implies that no selective self-test has ever been run
SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Completed [00% left] (0-65535)
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

root@WKMMG3:~# iostat -en | grep c4t0
    1   0   0   1 c4t0d0

root@WKMMG3:~# tail /var/adm/messages
Jul 13 15:27:12 WKMMG3 scsi: [ID 583861 kern.info] sd11 at marvell88sx0: target 7 lun 0
Jul 13 15:27:12 WKMMG3 genunix: [ID 936769 kern.info] sd11 is /pci@0,0/pci10de,376@a/pci8086,32c@0/pci11ab,11ab@0/disk@7,0
Jul 13 15:27:12 WKMMG3 scsi: [ID 583861 kern.info] sd22 at marvell88sx1: target 7 lun 0
Jul 13 15:27:12 WKMMG3 genunix: [ID 936769 kern.info] sd22 is /pci@0,0/pci10de,376@a/pci8086,32c@0/pci11ab,11ab@4/disk@7,0
Jul 13 15:29:35 WKMMG3 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci1043,8239@5,1/disk@0,0 (sd2):
Jul 13 15:29:35 WKMMG3  Error for Command: <undecoded cmd 0xa1>    Error Level: Recovered
Jul 13 15:29:35 WKMMG3 scsi: [ID 107833 kern.notice]    Requested Block: 0                         Error Block: 0
Jul 13 15:29:35 WKMMG3 scsi: [ID 107833 kern.notice]    Vendor: ATA                                Serial Number:
Jul 13 15:29:35 WKMMG3 scsi: [ID 107833 kern.notice]    Sense Key: Soft_Error
Jul 13 15:29:35 WKMMG3 scsi: [ID 107833 kern.notice]    ASC: 0x0 (no additional sense info), ASCQ: 0x0, FRU: 0x0

Question: is this an indication that the drive is running into the firmware bug smartmontools describes?
 
I'm trying to determine whether my M1015 is DOA or not; when I boot a recent Linux I get a *ton* of modprobe and kernel errors stemming from the megasas driver. Anyone else seeing similar behavior?

In particular, I am interested in whether anyone who has an Intel C204-based motherboard and a Sandy Bridge or Ivy Bridge Xeon - bonus points if you have a Tyan SS5512-based board - can try booting the Clonezilla live CD and tell me if you can get to the GUI cleanly, or if you see a ton of 'udevd: timeout: killing `/sbin/modprobe ...' messages.

Huge thanks in advance!!

(Also, if anyone is in the Bay Area and has a card I could borrow/try to see if mine is DOA or not, I'd be happy to buy you lunch!)
 
I'm trying to determine whether my M1015 is DOA or not; when I boot a recent Linux I get a *ton* of modprobe and kernel errors stemming from the megasas driver. Anyone else seeing similar behavior?
......

Can you post the modprobe & kernel errors?
Just some lines, not the duplicate errors/warnings.
 
Withdrawn, same thing happens when running the command on a new WD20EARX. Sorry. Can't find the embarrassed smiley. :)

Hi guys, just got 5 new HD204UI/VP4s. (Hint, they are inside of these).

Question: is this an indication that the drive is running into the firmware bug smartmontools describes?

OMG, going to order some, LOL.

WD Red does look good but costs too much.

BTW,

how do you guys split your drives? Or do you even split them?

8x 2TB = ~12 TB (raidz2) - would you guys use it as one large 16 TB drive or partition it?

I'm using my server as storage and running a Windows VM on it for the time being.

I also have two Crucial M4 64 GB drives.

Option 1:
use one for the OpenIndiana OS and use the other one as cache

Option 2:
use both for the OS but in raid1

Thanks
 