Such a thing as too much redundancy/overplanning?

I have been researching and designing a network backend for an ESXi server deployment that will be taking place this fall. I am at a point where I am just outside my comfort zone and am hoping for some input from the talented folks here. I should have some drawings up later, but for now maybe someone can look over my ramblings and provide some tips. (This is definitely an "It's 3:30am, my head hurts, time to type it out, get the thoughts out of my head, and go to bed" moment.) I am getting the final proposal together and want to make sure I have the correct NICs, etc. spec'd.

The hardware looks like this:
2 Dell R710 ESXi hosts
1 IBM x3550 management server
1 ZFS SAN (custom Supermicro)
1 ZFS backup SAN (R710)
2 HP 1800-24G switches
(1 HP J-series chassis for the main switch in another room)

After reading hundreds of pages over the last 3 days, I have come up with the following layout.

Let's start with the switches: the two HP 1800s are Switch A and Switch B respectively, and the J series is Switch C. The 1800s have a limited feature set, do not support spanning tree, and cannot be looped. This leaves me with the following two 2Gb trunks:

           Switch C
          /        \
         /          \
   Switch A       Switch B

This way I never loop A to B directly, using C as the middleman. The single point of failure here is Switch C, and we have ordered a spare to keep in storage.
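To convince myself this can't form a loop without spanning tree, here's a rough Python sketch (switch names are just labels, and each 2Gb trunk is treated as a single logical link) that flags a cycle. The planned A-C and B-C trunks pass; a direct A-B trunk would not:

```python
# Quick sanity check: with only A-C and B-C trunks there is no loop.
# Adding a direct A-B trunk would create one and would need spanning tree.

def has_loop(links):
    """Union-find cycle check on an undirected list of logical links."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return True          # both ends already connected -> loop
        parent[root_a] = root_b
    return False

planned = [("Switch A", "Switch C"), ("Switch B", "Switch C")]
print(has_loop(planned))                                # False - safe without STP
print(has_loop(planned + [("Switch A", "Switch B")]))   # True  - would loop
```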

Switches A and B host the following VLANs, with C just having access to VLAN 1 (a quick sketch of the membership follows the list):
VLAN 1: General traffic (open to all)
VLAN 2: Storage traffic (closed to all but the ESXi hosts and both ZFS servers)
VLAN 3: Management (closed to all but the servers, the management server, and the service VPN)
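To keep the membership straight in my head, here is a tiny Python sketch of who sits where (device names are placeholders for the real hosts, and the member lists are just my reading of the rules above):

```python
# Rough VLAN membership map - device names are placeholders.
vlans = {
    1: {"name": "general",    "members": "all"},
    2: {"name": "storage",    "members": ["esxi1", "esxi2", "zfs-primary", "zfs-backup"]},
    3: {"name": "management", "members": ["esxi1", "esxi2", "zfs-primary", "zfs-backup",
                                          "mgmt-server", "service-vpn"]},
}

def allowed_on(device, vlan_id):
    """True if the device belongs on the given VLAN."""
    members = vlans[vlan_id]["members"]
    return members == "all" or device in members

print(allowed_on("esxi1", 2))        # True  - hosts reach the storage VLAN
print(allowed_on("mgmt-server", 2))  # False - management box stays off VLAN 2
```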



Next up is the primary ZFS server:
There are a dozen different ways I could do this. I even thought about giving each of the datastores its own redundant connection (since ESXi only allows 1 adapter per datastore, with an additional one for failover), but it is just too complicated. If I need more speed for the storage data pool, I can always switch the gear to 10Gb for the storage network.
However, I have come to believe this is the best initial setup:
2 quad-port 1Gb Intel NICs, 2 onboard ports, 1 KVM.
ZFS server:
NIC 1 Port 1) ZFS share trunk A
NIC 1 Port 2) ZFS share trunk B
NIC 1 Port 3) Backup ZFS trunk
NIC 1 Port 4) Spare
NIC 2 Port 1) ZFS share trunk A
NIC 2 Port 2) ZFS share trunk B
NIC 2 Port 3) Backup ZFS trunk failover
NIC 2 Port 4) Spare
Onboard 1) Management network
Onboard 2) Spare
KVM 1) Management

Switch A:
Port 1) ZFS trunk A
Port 2) ZFS trunk A
Port 3) Backup ZFS

Switch B:
Port 1) ZFS trunk B
Port 2) ZFS trunk B
Port 3) Backup ZFS failover

This gives me a 2Gb redundant trunk between the ZFS box and the switches, meaning I could lose a NIC, cable, or switch and still maintain a link. The spare ports allow for creating a second link for a datastore or to the backup server/management server.
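And a quick sketch of the failure cases I care about, based on my reading of the layout above (names are just labels): any single NIC or switch failure should still leave at least one live share link.

```python
# Each share link is (nic, port, switch), matching the layout above.
zfs_share_links = [
    ("NIC 1", "Port 1", "Switch A"),
    ("NIC 1", "Port 2", "Switch B"),
    ("NIC 2", "Port 1", "Switch A"),
    ("NIC 2", "Port 2", "Switch B"),
]

def survives(links, failed):
    """True if at least one link avoids the failed component."""
    return any(nic != failed and switch != failed for nic, _port, switch in links)

for failure in ("NIC 1", "NIC 2", "Switch A", "Switch B"):
    print(f"lose {failure}: link still up -> {survives(zfs_share_links, failure)}")
# Every single-component failure should print True.
```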

The Backup ZFS server would look like this:
2 dual-port Intel Pro/1000 NICs, 2 onboard ports, 1 KVM
NIC 1 Port 1) Backup ZFS trunk
NIC 1 Port 2) Spare
NIC 2 Port 1) Backup ZFS trunk
NIC 2 Port 2) Spare

Onboard 1) Management
Onboard 2) Spare
KVM 1) Management

Same idea as above: I could lose a cable, NIC, or switch and still have a connection.

Moving to the ESXi hosts:
Using the above configuration, each server would have 2 dual-port Pro/1000 NICs, 2 onboard ports, 1 KVM (a teaming sketch follows the list).
NIC 1 Port 1) ZFS
NIC 1 Port 2) General (VLAN 1)
NIC 2 Port 1) ZFS failover
NIC 2 Port 2) General (VLAN 1)

Onboard 1) Management
Onboard 2) Spare
KVM 1) Management
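The way I'm picturing the teaming on each host, sketched in Python (port labels follow my list above; the check just confirms each port group's active and standby uplinks land on different physical NICs):

```python
# Per-host vSwitch teaming plan - labels are mine, not real vmnic names.
host_uplinks = {
    "storage (VLAN 2)": {"active": ("NIC 1", "Port 1"), "standby": ("NIC 2", "Port 1")},
    "general (VLAN 1)": {"active": ("NIC 1", "Port 2"), "standby": ("NIC 2", "Port 2")},
}

for portgroup, team in host_uplinks.items():
    separate_nics = team["active"][0] != team["standby"][0]
    print(f"{portgroup}: active {team['active']}, standby {team['standby']}, "
          f"separate NICs -> {separate_nics}")
```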

The management box is the simplest:
2 onboard, 1 KVM
Onboard 1) Management
Onboard 2) General
KVM 1) Management


Well, just writing this was well worth it. I spent about 2 hours thinking about the design and removed the triple datastore links + redundancy; I just don't think we can use the speed, especially with how small the network is right now. Plus, it would only provide redundancy if I could put one end of each 2-port trunk on a separate switch and have the switches handle that. Maybe these HPs can, but I have not looked into it yet.

My switches end up with the following port layouts:
12 Management (6 per switch: 8 server, 4 additional devices)
4 Uplink trunks (2 per switch)
12 Storage (6 per switch)
6 General (3 per switch)

And plenty of room to grow until we can afford new 10Gb gear.
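Quick back-of-the-envelope on the 1800-24G port budget, straight from the counts above:

```python
# Per-switch port budget on the HP 1800-24G (24 ports each).
per_switch = {"management": 6, "uplink trunk": 2, "storage": 6, "general": 3}

used = sum(per_switch.values())
print(f"used per switch: {used} / 24, free: {24 - used}")  # used 17 / 24, free 7
```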

Anyways, suggestions are always welcome.
 
How many users and what type of business? How valuable is their information and uptime to them?

Those questions will answer whether it is worth it IMO.

I like your plan so far.
 
One thing too: test it afterwards. It's one thing to have redundancy, but it's even better when you know it works. :D Set up some test VMs, pull plugs, etc. Do real-life tests before it goes live.

Where I work, we were never given that chance; it turns out our redundant iSCSI setup is not so redundant. Now that it's live, we're not really allowed to try to fix it either.
 
Are you using a licensed version of ESXi or the free one? If you're using VMware HA, you want redundant management connections, or else the box could become isolated and invoke HA.

Edit: And next time draw a logical diagram. ;)
 
Good to know I have not gone crazy. I think a link per datastore is out of the question right now; just adding an additional redundant trunk + ESXi link with failover would take up 6 more ports and require 2 more quad-port NICs. At that point I would dedicate 2 more switches purely to the SAN and 2 more for the servers/maintenance.

The environment is small right now: 40 users with 60 or so devices. The likelihood that they will double in size over the next 3 years, if not triple, is huge. They just looked at a new building down the street from my house, which would allow them to have at least triple the devices and double the staff.
The planned server setup looks like this:
Server 2k8r2 DNS/DHCP/AD
Server 2k8r2 Fileserver Backup DNS/DHCP/AD
Server 2k8r2 Exchange server
Server 2k8r2 App server
Server 2k8r2 Database server for an in-house barcoding + access control system
Nagios CentOS5 box
Server 2k8r2 FTP server, print server

Server 2k8r2 management server (on its own box)
OpenIndiana ZFS
OpenIndiana Backup ZFS

Free version of ESXi for starting out. I plan to test everything possible :) Minimal downtime is the goal (if I lose 15-30 minutes switching things around/over, no one is going to get fired).

I am just finishing up some Visio drawings; I will post those soon.
 
Just added everything up: I need something like 34 network cables + 4 for the main link. Damn.
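For what it's worth, the per-category switch port counts from my earlier list add up to the same figure:

```python
# Cable count sanity check from the switch port layout posted earlier.
ports = {"management": 12, "uplink trunks": 4, "storage": 12, "general": 6}
print(sum(ports.values()))  # 34 - lines up with the ~34 cable estimate
```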
 
Never! You can never have too much redundancy. You can have too much expenditure, though!
 
Haha, very true.

On page 15 of the Visio drawings... guessing I will have about 20 when all is said and done. I have broken down everything I can think of.
 
If you are seriously on page 15 of a drawing for this extremely simple configuration, you're doing it VERY wrong. At most for this you'd need 2: a logical diagram and a physical diagram. That's it.

What you're doing isn't very complex.
 
I am drawing everything... from AD to the rack and the server cabling.
 
Oh, at about 3am when I can't sleep due to allergies/back issues from a car accident. Plus, I am working with the onsite tech who handles my day-to-day stuff. Drawings have made it 100x easier to explain how and why things are set up certain ways, especially to someone who had only ever installed VMware Workstation and tried to open a VM image I made 2 years ago.
 