I think OmniOS is the way to go if you want a stable server OS.
Regarding my post about the iSCSI target dying: I guess I spoke too soon. It just crashed 4 times today. Will report when I have some more info.
Matej
I didn't change any settings on the server (target), but I had to set higher timeouts on the clients (initiators) because of the way we use iSCSI, which is not ideal (remote clients, sometimes high latency,...) and the lower timeouts led to iSCSI drives dropping. I know iSCSI wasn't made for that kind of...
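For reference, on a Linux initiator (open-iscsi) the kind of knob I mean is roughly this one; the value is just an example, not a recommendation:
# /etc/iscsi/iscsid.conf (open-iscsi)
# raise the replacement timeout so a short latency spike doesn't drop the session
node.session.timeo.replacement_timeout = 180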
Interesting limits. I wonder why there isn't a bigger default limit on systems with more memory. In my case, the time needed to free RAM cost us a service crash. Unfortunately my graphs are already aggregated and I can't look back at them to see if there was a momentary lack of free...
Glad I can be of help.
For now I just did a rough, rule-of-thumb limit and capped my ARC at 200GB.
Looking at my graphs, I usually had around 10GB of free memory, and now I have around 18GB after the ARC fills up.
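For anyone wondering how to cap it on OmniOS, the usual way is the zfs_arc_max tunable in /etc/system (value in bytes); a sketch of the setting:
# /etc/system -- cap the ARC at 200GB (200 * 1024^3 bytes), applied at the next reboot
set zfs:zfs_arc_max = 214748364800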
Matej
About a year ago, I wrote about my problems with the ZFS pool freezing and the iSCSI target dying.
At random, and towards the end at the same time every week, our iSCSI target died (svcs reported the service as running, but nothing was listening on port 3260) and we were unable to write to the pools...
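For completeness, the checks behind that observation were roughly these (the SMF state of the COMSTAR target, and whether anything is actually listening on 3260):
$ svcs -l svc:/network/iscsi/target:default
$ netstat -an | grep 3260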
I guess I will have to do some tests on my own. I want to fully test it before I deploy it to a production SAN. I don't want any trouble with 400TB of data, it takes ages to restore:)
Matej
I have around 100 4TB SATA Constellation drives and around 200 4TB & 6TB SAS Constellation drives running for over 4 years, and I think 2-5 have failed in that time.
Matej
Hey there!
I was wondering what the currently recommended version of the LSI SAS2 firmware is. Some say P15, others P18 and some P19. What do you use? Has anyone updated to the updated (??) P20?:)
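(For context, the firmware/BIOS versions currently flashed on the HBAs can be listed with LSI's sas2flash utility:)
$ sas2flash -listall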
Is anyone having problems with iSCSI being randomly disconnected (I notice this mostly with Windows...
You set it when you create the LU:
$ stmfadm create-lu -p blk=512
Check stmfadm create-lu -?
You can check the current block size with:
$ stmfadm list-lu -v
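A fuller example, in case it helps (the zvol name and size are just placeholders):
$ zfs create -V 100g pool0/lu0                              # backing zvol
$ stmfadm create-lu -p blk=512 /dev/zvol/rdsk/pool0/lu0     # blk sets the logical block size the LU reports
$ stmfadm list-lu -v                                        # shows the LU properties, including Block Size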
lp, Matej
So far no problems. In the meantime, one drive did fail, but it was easy to find:
- the drive was 100% busy even though there was almost no traffic
- SMART showed 'Drive failing'
The other drives are humming along nicely, and they have taken quite a beating by now with various test scenarios...
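(For anyone curious, the failing drive stood out right away in the %b column of iostat:)
$ iostat -xn 5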
Matej
I've been running one server with 256GB of memory for about 2 years. We have had some troubles, but they were not related to memory.
I'll start a new cluster in a month, with both nodes having 256GB memory. Ping me after half a year and I can report:)
Matej
We have a server with 256GB that has been running for 2 years without a problem.
I also talked about this with OmniTI 2 days ago and they said there should be no problem with that much memory. They probably run their production servers with even more than 256GB of memory.
Matej
I did enable the ARC for the tests, and it might be that the data was not in the ARC. I should run the same test twice or more, to eliminate reads (I have enough memory to cache everything).
On the other hand, I have recordsize set to 4k, so there shouldn't be any read-modify-write (RMW). It could be that blocks...
The thing with fio is that I'm also writing 4k blocks, but for some unknown reason it is still doing RMW (I guess).
I will turn on the ARC cache, so reads will be eliminated, and check again...
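For reference, the kind of fio job I'm running (dataset, file name, size and runtime are just example values):
$ zfs set primarycache=all pool0/test     # make sure the ARC caches data again for this dataset
$ fio --name=sync4k --filename=/pool0/test/fiofile --rw=randwrite --bs=4k \
      --ioengine=psync --sync=1 --size=10g --runtime=60 --time_based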
Matej
danswartz: I will try this ramdisk solution today. I tried using RAM as a ZIL device yesterday in Linux, but after creating a RAM block device, I did not see any traffic over it via iostat. I will try again today in OmniOS and report back.
HammerSandwich: I could do that and I will, when I get back...
Gea:
For the test, I added another 10-drive raidz2 vdev, so now I have the following config:
pool0:
* 10 drives raidz2
* 10 drives raidz2
* log mirror ZeusRAM
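Roughly how the second vdev and the log mirror were added (the disk names here are placeholders, not my real ones):
$ zpool add pool0 raidz2 c1t10d0 c1t11d0 ... c1t19d0   # second 10-drive raidz2 vdev
$ zpool add pool0 log mirror c2t0d0 c2t1d0             # mirrored ZeusRAM slog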
If I'm writing at 48k IOPS@4k, that is around 180MB/s.
Both vdevs combined should be able to write sequentially with at least...
Nope, the CPU is not the bottleneck.
I switched to linux today.
I tested 4k random sync writes directly to a ZeusRAM: 48k IOPS -- OK
I tested 4k random sync writes to an mdadm raid1 built with 2 ZeusRAMs: 48k IOPS -- OK
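(The raid1 was created roughly like this; the device names are just examples:)
$ mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc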
Then I created a ZFS raidz2 pool with 10 drives and the 2 ZeusRAMs in a mirror as the ZIL...
vektor777:
I have 2 pools. My firmware revision is C025.
Pool1:
NAME                        STATE     READ WRITE CKSUM
data                        ONLINE       0     0     0
  raidz2-0                  ONLINE       0     0     0
    c6t5000C500836B2889d0...
Yeah, their SATA SMART is... hard to read?:)
When "Elements in grown defect list" starts to grow, SMART probably also raises an alarm. I will create a Nagios check and a collectd script anyway, so I won't be surprised one day:)
Matej
Is anyone using a ZeusRAM for the ZIL? What write/read performance do you get for 4k random writes?
I have a raidz2 pool with 10x4TB SAS drives and a ZeusRAM for the ZIL, and when testing with fio I only get around 2500 IOPS, which I think is way too little.
I also tried creating a mirror pool with 2xZeusRAM...
Hello again...
I got an answer back from Seagate saying that they don't filter what the disk reports back, hence so many errors. They are raw values. All is good and we decided to keep the SAS3 drives.
As far as price goes, SAS drives are not that much more expensive. We pay about 10EUR more for SAS...
OK, I will look at "Elements in grown defect list", but so far all hard drives have a value of 0 - as one would expect for new drives:)
It's interesting that there are SO many errors. 25,000,000 errors when reading 10GB of data seems like A LOT. If I run the same transfer on an HGST drive, the counter...
Today I tested the same drive in 3 different JBODs with 2 different servers and 3 different controllers (all the same brand/model/firmware, though).
I remembered I also have a brand new IBM server in the rack with 3.5" hard drives. I plugged my Seagate in and powered it on. When the system booted, I did...
The output from smartctl -l error is the same as the output of the -a command.
I can see the verify row on the Hitachi drive, but not on the Seagate.
I will try some more configurations today, but I think they will all yield the same problems. I have 3 more JBODs and 1 more controller I can test on. I'm waiting...
Hey there,
thanks for your help and input.
smartctl -A returns even less than -a:
smartctl -A /dev/sdb
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.10.0-229.14.1.el7.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ...
Hello!
I have the following new setup:
- server with an LSI 9207 HBA, firmware P19
- Supermicro 837E26-RJBOD1 28-bay JBOD
I'm having trouble with a high "Read errors corrected by ECC" counter in SMART.
Writing to hard drives gives no errors, but as soon as I start to read data from hard...
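For reference, this is roughly how I'm pulling the counters out (the device name is just an example):
$ smartctl -a /dev/sdb | grep -A 8 "Error counter log"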
I guess I spoke too soon:)
It crashed during the night. Although it looks like only the iSCSI target froze this time, not the FS itself. At least Nagios did not report problems...
Matej
We managed to change all the hard drives, JBOD cases, servers, LSI HBA cards and SAS cables, and upgraded OmniOS to the latest r151014 version,... Actually, we changed all the hardware, we changed the pool design (from a single 50-drive pool to a 7 vdev x 10 drive pool), we even changed datacenters and the switch, yet we...
I am leaning towards UniFi as well. Actually, I can use the WRT54GL as the router and add a UniFi AP.
I can later upgrade the WRT to an EdgeRouter. I have one at home and it's quite nice, although I miss some features in the web interface. Not much of a problem for me, since I mainly configure it from the CLI, but some...
Hello!
We are in need of a new router or AP. Currently we are using the old faithful WRT54GL, but it is slowly showing signs of age...
So we are looking for something new:
- around 15 clients over wifi
- 4-5 clients over LAN (we have another switch)
- AC support
- PPPoE support
What...
A bit of a late reply from my side.
Thanks for the info. I did some more reading in the meantime and figured out the same things you pointed out...
I won't go with a ZeusRAM, because it's too expensive for us. A good SAS SSD will have to do. Probably something like the HGST SSD800MH.B or Seagate S1200...
Hello!
I already have a topic on servethehome, but I said I'd ask here as well and maybe get some more info.
So, we are rebuilding our OmniOS SAN and will build an HA cluster with the help of the RSF-1 software. For us to be able to connect our JBODs to 2 controllers, we need a dual-port...