What a nightmare; luckily I have no data on there yet!
ESXi 5.5 on a Supermicro X9SRH-7TF with 24GB ECC RAM, the onboard LSI 2308 IT in passthrough mode.
32GB and 64GB SLC SSDs
Six new Hitachi HDS5C302-A580 2TB drives, still in vacuum-sealed bags dated Jan 2011 (they've been lying around gathering dust for the best part of two years, shame on me), connected with whatever SATA cables I could find. I did suspect dodgy cables, so I swapped them around a bit and added a few new ones, but no luck there.
So I set up Solaris 11.1 following the instructions here: http://www.abisen.com/lsi2308-with-solaris11.1.html. Everything was working until I created a pool from the CLI, added napp-it and ran filebench, then all hell broke loose… Probably important is the fact that I stupidly accepted the ESXi default of 3GB RAM for the 11.1 VM!
I set up the pool like this:
zpool create raidarray mirror c0t5000CCA369C5BAA6d0 c0t5000CCA369C66F1Ed0 mirror c0t5000CCA369C70ED5d0 c0t5000CCA369C728F2d0 mirror c0t5000CCA369C72952d0 c0t5000CCA369C72955d0
It worked fine at first, but while filebench was running the console and UI slowed down, probably the beginnings of memory exhaustion, and then the ZFS status was showing the first vdev as degraded. So I cleared the fault and started a scrub, which was a very bad idea: the console and the UI froze completely and I had to do a power cycle. While it was down I cranked the RAM up to 8GB. Solaris then wouldn't boot unless I used the pre-napp-it Solaris GRUB entry, and once it was up the raidarray pool was missing; no matter what I did I couldn't get the same six-disk striped mirror back again. At some point before the power cycle I caught a glimpse of read and write errors on both disks in the first vdev mirror, c0t5000CCA369C5BAA6d0 and c0t5000CCA369C66F1Ed0, bugger!
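For reference, and purely from memory so the exact commands may be slightly off, the sequence was roughly:
zpool status raidarray
zpool clear raidarray
zpool scrub raidarray
and after the power cycle, trying to find the missing pool again:
zpool import
zpool import raidarray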
So I tried Ubuntu: it wasn't happy with two of the six disks. Because the naming and positioning were different I can only assume they're the same two drives; SMART checked out on all but one disk.
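By "SMART checked out" I just mean the health status and attributes looked clean, checked with something like this for each drive (I don't remember the exact device names Ubuntu gave them):
sudo smartctl -a /dev/sdb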
FreeNAS was the same: not happy with two of the six disks, and the naming is different again, as is the positioning.
I have tried fdisk, parted, gparted, format… most of the tools just froze on the suspect disks, and I could see errors in dmesg… I have hit a brick wall for today.
What are the chances of two new HDDs out of six being DOA? What should I do to troubleshoot from here? I will take all the disks out, hook them up to a standard SATA port one by one and run some tests on them, but what OS and what tools?
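My rough plan, assuming an Ubuntu live USB on a spare machine, was a long SMART self-test followed by a destructive badblocks pass on each drive (there's no data on them anyway), something like:
sudo smartctl -t long /dev/sdX
sudo badblocks -wsv /dev/sdX
but I'm very open to better suggestions.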
To come totally clean, the disks aren't properly physically mounted yet: I built a basic aluminium frame out of MakerBeam and slotted the six disks into it, with gaps between them and a 140mm fan blowing air through the gaps. Death by combined vibration did pop into my mind, but surely that wouldn't happen after well under 24 hours of total run time?
Please help.
Cheers
Richard