Nutanix CE? What are you running?

Outlaw85

While searching for the next "opportunity", I've been wanting to get Nutanix CE running in a lab to tinker with and keep learning.

I know the min physical specs:
4 cores
16GB memory
Intel NIC
2 drives minimum (1 SSD, 1 HDD; max 4 drives)
200GB+ for the hot tier
SATADOM / 8GB USB minimum for OS/boot

I'm wondering if anybody here is running it and what they are using.




Ideally, I want to spec out at least 3 of the same machines/builds. ITX (or even m-ATX) would be cool to build into a single server box, but it's probably cost-prohibitive.

CPU - I'm hoping the L5638 would work. They're older, but the price is right and they have a couple of extra cores with lower power consumption.
MEM - Depends on the new board. The board I have now (S5500BC) only supports 32GB and 2x L5638. It may still work, though. Looking at the prices... dang.
NIC - Shouldn't be an issue. I have Intel cards, and if it's an Intel board, I'm hopeful they sourced their own NICs.
DISKS - 240GB SSDs are pretty cheap now, which should meet the hot tier requirement. I have a few 1TB HDDs for the cold tier.
OS/BOOT - Like the hot tier, the drives/USB sticks are pretty cheap; not really an issue.

I had a few VMs running in my ESXi lab, including Plex.
 
I have not tried CE. I am currently migrating to Nutanix/AHV from VMware at work, and I don't know why anyone would pick Acropolis over VMware.
 
I have not tried CE. I am currently migrating to Nutanix/AHV from VMware at work, and I don't know why anyone would pick Acropolis over VMware.



^^^ This

At my last job, VMware licensing was in the millions. AHV's cost is included with the hardware purchase. I haven't gotten to dig into CE or AHV so I can't comment on pros/cons, but the cost alone is what makes a company switch, not supportability by the engineers.
 
I'm not sure there's enough interest here, but I just got a single node installed tonight. Not the most efficient system to run it on, but I can at least play now.

2x Xeon 5540
32GB memory (board max, and the minimum recommended due to the CVM)
1x 250GB Samsung 850 EVO SSD for hot tier
1x 1TB WD Green HDD for cold tier
1x 64GB SATADOM for OS

Oh and you need a NEXT account to keep CE active.

And tada!
 
I wonder if the cheap Wish X79 boards, a cheap v2 6-core/12-thread chip, and 16GB of RAM would work. They have 4 DIMM slots and an M.2 SSD slot (SATA 3).
 
I wonder if the cheap Wish X79 boards, a cheap v2 6-core/12-thread chip, and 16GB of RAM would work. They have 4 DIMM slots and an M.2 SSD slot (SATA 3).

I just happened to check if there were any updates and was planning to add another one.

If the boards I found are the ones you are talking about, I would load them full (64GB max). Depending on what you want to do, resources go way too fast, and a chunk is reserved. Also, get as many cores as you can, since it's a single socket and each board (node) requires a CVM at a minimum of 4 vCPU and 16GB.
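If it helps, here's some rough back-of-the-napkin math for what's left on a node after the reserved VMs, using the figures from this thread (CVM minimum of 4 vCPU / 16GB per node, and a small Prism Central VM at 4 vCPU / 16GB if you park it on the same box). Purely illustrative; real overhead varies by AOS version and the features you turn on.

```python
# Rough lab sizing: what's left for guest VMs after the reserved VMs.
# Figures from this thread: CVM minimum 4 vCPU / 16GB per node,
# small Prism Central VM 4 vCPU / 16GB. Illustrative only.

def leftover(threads, ram_gb, host_pc=False):
    """Return (threads, GB) remaining for guest VMs on one node."""
    cvm_cpu, cvm_ram = 4, 16
    pc_cpu, pc_ram = (4, 16) if host_pc else (0, 0)
    return threads - cvm_cpu - pc_cpu, ram_gb - cvm_ram - pc_ram

print(leftover(12, 16))        # 16GB board: (8, 0)  -> nothing left for guests
print(leftover(12, 64))        # 64GB board: (8, 48)
print(leftover(12, 64, True))  # 64GB board hosting PC too: (8, 32)
```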

Since my post above, I've deleted and redeployed CE on the same box. I was able to get Files installed but haven't tested anything yet. Now I'm trying to get Prism Central working, but there are two problems:
- I didn't have enough resources on the AHV CE node, so I had to deploy it in the ESX cluster AFTER changing the OVA to SHA1.
- Because it's not "true" AHV, PC doesn't like it. I'm having to go through different versions in hopes of one working. More research is needed here to find out if there's just something to change in a config.

Here's my single node today (note: there are no actual workloads, just a Files VM (4x12) and the CVM (4x16)):



**Update**
Pretty sure I would be able to modify the JSON for PC to work on AHV CE, but I don't have enough resources, and deploying PC manually through an OVA doesn't have the same JSON file requirement. Looks like I may not be able to have PC until I get something better to run it on.
 
I took a look at my environment and PC is a pretty hefty VM.

That looks like the large PC VM, for 5-25k VMs. The small one is 4 vCPU and 16GB and supports up to 12.5k VMs.
This is the OVA deploy; you don't get to set anything.
 
I have not tried CE. I am currently migrating to Nutanix/AHV from VMware at work, and I don't know why anyone would pick Acropolis over VMware.

Licensing costs; that is pretty much the only reason. If you went Nutanix with VMware, you may as well have just bought Dell or HPE, and Nutanix is basically a software company now anyway, using OEMs for hardware.
 
Licensing costs; that is pretty much the only reason. If you went Nutanix with VMware, you may as well have just bought Dell or HPE, and Nutanix is basically a software company now anyway, using OEMs for hardware.

This is true; they went to software and subscription/capacity-based licensing. They are still partnered with Supermicro for their OEM hardware (single support call for hardware and software) and have started a partnership with HPE (single initial call but separate support for hardware and software), but are otherwise hardware agnostic for most things. In my experience, the reason to pick Nutanix with VMware is to trial AHV/HCI without quitting VMware/Hyper-V/KVM cold turkey. At least with VMware, if you do decide to make the switch to AHV, you can do that fairly easily.
 
All good stuff. Trying to build a cheap home lab to start messing with this stuff. I am getting rusty since at my current job we are 100% in Azure for all infrastructure, along with Palo Alto VMs for controlling traffic between subnets vs. Azure's built-in shit.

My background has always been with hardware. At my last job I helped manage 60 ESXi hosts and a fairly large UCS blade infrastructure.
 
This is true; they went to software and subscription/capacity-based licensing. They are still partnered with Supermicro for their OEM hardware (single support call for hardware and software) and have started a partnership with HPE (single initial call but separate support for hardware and software), but are otherwise hardware agnostic for most things. In my experience, the reason to pick Nutanix with VMware is to trial AHV/HCI without quitting VMware/Hyper-V/KVM cold turkey. At least with VMware, if you do decide to make the switch to AHV, you can do that fairly easily.

You said moving from ESX to AHV is easy? I have moved about 500 machines myself, and using Nutanix Move is pretty easy as long as your machines are Server 2008 R2 or above, or RHEL 6.3 and above. If you have servers or systems below that, it can become a real pain in the ass with disk-only migrations and manual installs of legacy VirtIO.
 
I'd hope your machines are Server 2012 and above; otherwise you shouldn't be bothering to move them. I get client-side issues and keeping XP VMs around because they are air-gapped, but usually if you need a server OS it's because you're trying to host something for other systems to connect to.

But I'd definitely like to hear more follow-ups about Nutanix. I'm not so much concerned about their appliances as their software. Is anyone using Prism? Running Prism for management but then AHV, ESXi, or Hyper-V underneath it?
 
All good stuff. Trying to build a cheap home lab to start messing with this stuff. I am getting rusty since at my current job we are 100% in Azure for all infrastructure, along with Palo Alto VMs for controlling traffic between subnets vs. Azure's built-in shit.

My background has always been with hardware. At my last job I helped manage 60 ESXi hosts and a fairly large UCS blade infrastructure.

I wouldn't say that's rusty at all. If anything, you probably have a leg up on a lot of people. The general trend I'm seeing is less and less on-prem hardware, and more hosted. If you were to change jobs, I'd guess they would be more inclined to hire someone who has cloud migration and maintenance experience than someone who has physical on-prem experience. The physical on-prem experience is useful if you want to work in a colo, but for the average company probably less so. It kind of makes me a bit sad, though, because I do prefer working with physical hardware over software, but a typical job seems to require that less and less.
 
All good stuff. Trying to build a cheap home lab to start messing with this stuff. I am getting rusty since at my current job we are 100% in Azure for all infrastructure, along with Palo Alto VMs for controlling traffic between subnets vs. Azure's built-in shit.

My background has always been with hardware. At my last job I helped manage 60 ESXi hosts and a fairly large UCS blade infrastructure.

It sounds cool. I never got into the cloud side. I was one of the SEs managing the hodgepodge of mixed infra on-site, but Nutanix/PE was my main focus.
If you've not seen it already, some ideas: https://next.nutanix.com/homelab-setup-42
A good walk-through: https://dreadysblog.com/2017/06/29/building-a-hci-lab-with-nutanix-community-edition/ My only additional note here is to have a minimum of 64GB of physical memory if running a single node; the CVM and PC will chew up 32GB alone.
The Intel NUCs are pretty popular for size and power consumption but can be pricey upfront when loaded up with decent specs.

Going to drop this here for later (free account required to access)- https://next.nutanix.com/discussion-forum-14/download-nutanix-ce-docs-and-guides-3188


You said moving from ESX to AHV is easy? I have moved about 500 machines myself, and using Nutanix Move is pretty easy as long as your machines are Server 2008 R2 or above, or RHEL 6.3 and above. If you have servers or systems below that, it can become a real pain in the ass with disk-only migrations and manual installs of legacy VirtIO.

I said "FAIRLY" easy :D This is IT where there is no simple Yes or No answer. Everything is "it depends", "yes, but..." or "no, but..."


I'd hope your machines are Server 2012 and above; otherwise you shouldn't be bothering to move them. I get client-side issues and keeping XP VMs around because they are air-gapped, but usually if you need a server OS it's because you're trying to host something for other systems to connect to.

But I'd definitely like to hear more follow-ups about Nutanix. I'm not so much concerned about their appliances as their software. Is anyone using Prism? Running Prism for management but then AHV, ESXi, or Hyper-V underneath it?

Anything specific? Similar to relapse808, the last company I worked for was a Nutanix customer: ~800 nodes including VDI, all running ESXi under Prism. I got a 3-node AHV cluster stood up for a POC, but that was a lost cause and hadn't been touched.

I wouldn't say that's rusty at all. If anything, you probably have a leg up on a lot of people. The general trend I'm seeing is less and less on-prem hardware, and more hosted. If you were to change jobs, I'd guess they would be more inclined to hire someone who has cloud migration and maintenance experience than someone who has physical on-prem experience. The physical on-prem experience is useful if you want to work in a colo, but for the average company probably less so. It kind of makes me a bit sad, though, because I do prefer working with physical hardware over software, but a typical job seems to require that less and less.

I agree. We had a huge push for cloud "transformation," and while they were going nuts building in the cloud, there were still many things that either weren't cloud-ready or where they were dragging their feet, so we still had significant on-prem purchases. Also, if not done right, cloud is crazy expensive compared to on-prem, at least from the information I gathered.
 
We are now working to build a cluster that will be AOS with ESX as the hypervisor. We need it to run some virtual appliances, and Nutanix is not supported very well in this area at all.
 
Anything specific? Similar to relapse808, the last company I worked for was a Nutanix customer: ~800 nodes including VDI, all running ESXi under Prism. I got a 3-node AHV cluster stood up for a POC, but that was a lost cause and hadn't been touched.

So basically AHV was kind of junk then? Guessing there were limitations on what it could do. For my purposes I'm hardly using any bells or whistles, so it would be interesting to know whether it was a missing must-have feature or whether it was unreliable or difficult to work with.

I don't know if you'd have a comparison, but I generally hate VMware vCenter with a passion. Whoever is in charge of it can't seem to figure out how to make it not break over every little thing (partly because they chopped the disks up so small that if you do something like click "download support bundle," it might conveniently leave a 2GB file on the 10GB log volume and cause it to run out of space). Upgrades are kind of a pain because you have to go through a separate portal to upgrade it, and there are all kinds of random gotchas in every upgrade. One example is that someone seemed to think it was a good idea to require DNS RRs in order for the upgrade to work, but they weren't smart enough to actually make that a pre-upgrade check. So 45 minutes into the upgrade, it will completely bomb out, and per their configuration you cannot just resume after you fix the problem. You literally have to blow away the upgraded VM and start all over again.


For Prism I guess mainly just general usability:

HTML5 web interface
Quick to load
Easy to update through
Simple to put hosts into maintenance mode without having to turn off alarms in a hundred places, otherwise everyone gets alerts
Does Prism do magic to allow ESXi to ad hoc share storage space using direct-attached storage (I'd imagine you're probably using a large SAN anyway)
Do you have random bugs that you need to fix all the time
Have you done upgrades to Prism, how easy is it to do
Connections to host console, is it secure, does it support copy/paste
Updating SSL certs, easy to do or is it not intuitive
Stability / uptime. Can you run multiple instances of the host
Logging, easy to find things
How often do you end up on the phone with support for things you can't solve yourself

Just thinking of all the random things that are day-to-day maintenance, because generally Prism is going to be back-end only, so no one outside of your team is going to work with it. I know in one of the demos I sat in they were touting heavily that you didn't need a monolithic SAN for their software to operate, but I can't recall if that was only if you used their hypervisor. These days, as long as you can keep the redundancy, stuffing a handful of SSDs into a host is a lot more cost-effective than buying an entire SAN just so you can put all of those drives in one place.
 
LCM management in PC and PE is super easy, that's for sure. Setting up a cluster from scratch is also pretty easy as long as you are on a flat network and use portable Foundation. The big issue for me was that there are no custom templates for VMs like in VMware. I wrote a custom GUI PowerShell script, though, that can deploy Windows VMs with ease. I have been on with support quite a bit, due to the Supermicro hardware being crap (4 nodes replaced already in 14 months) and other issues we have had. I haven't seen a lot of bugs, but I have run into many software vendors that refuse to support it.
 
So basically AHV was kind of junk then? Guessing there were limitations on what it could do. For my purposes I'm hardly using any bells or whistles, so it would be interesting to know whether it was a missing must-have feature or whether it was unreliable or difficult to work with.

I don't know if you'd have a comparison, but I generally hate VMware vCenter with a passion. Whoever is in charge of it can't seem to figure out how to make it not break over every little thing (partly because they chopped the disks up so small that if you do something like click "download support bundle," it might conveniently leave a 2GB file on the 10GB log volume and cause it to run out of space). Upgrades are kind of a pain because you have to go through a separate portal to upgrade it, and there are all kinds of random gotchas in every upgrade. One example is that someone seemed to think it was a good idea to require DNS RRs in order for the upgrade to work, but they weren't smart enough to actually make that a pre-upgrade check. So 45 minutes into the upgrade, it will completely bomb out, and per their configuration you cannot just resume after you fix the problem. You literally have to blow away the upgraded VM and start all over again.


This is why you always take a snapshot of your vCenter before upgrading... Upgrading 101: snapshot first, just in case...
 
I haven't seen a lot of bugs, but I have run into many software vendors that refuse to support it.

That's a good point I hadn't thought about too much. You can likely import an OVA/OVF, VMDK, etc. into AHV, but the vendor will likely still blame everything on your hypervisor if anything is wrong. I generally don't talk to support a lot because it's easier for me to just figure it out than to play the back-and-forth game until they fix it.
 
This is why you always take a snapshot of your vCenter before upgrading... Upgrading 101: snapshot first, just in case...

Of course you have a snapshot, or in the case of vCenter you should have an entirely new VM, because in my experience it always requires you to make a new VM and then copies the data to the new VM. So it's just a matter of deleting the new VM it created and starting the entire process again. The bad part is that generally I have the VM for it on one of the nodes it's managing, because I'm not going to run another host just for the sake of handling vCenter, so you need to make sure you know which node it's on before touching anything; otherwise, if the upgrade fails, you get to start searching through nodes till you find the one that was running it and has the old powered-off VM. I've upgraded a heck of a lot of software over the years and I've never seen anything crash and burn like VMware does.
 
So basically AHV was kind of junk then? Guessing there were limitations on what it could do. For my purposes I'm hardly using any bells or whistles, so it would be interesting to know whether it was a missing must-have feature or whether it was unreliable or difficult to work with.

I don't know if you'd have a comparison, but I generally hate VMware vCenter with a passion. Whoever is in charge of it can't seem to figure out how to make it not break over every little thing (partly because they chopped the disks up so small that if you do something like click "download support bundle," it might conveniently leave a 2GB file on the 10GB log volume and cause it to run out of space). Upgrades are kind of a pain because you have to go through a separate portal to upgrade it, and there are all kinds of random gotchas in every upgrade. One example is that someone seemed to think it was a good idea to require DNS RRs in order for the upgrade to work, but they weren't smart enough to actually make that a pre-upgrade check. So 45 minutes into the upgrade, it will completely bomb out, and per their configuration you cannot just resume after you fix the problem. You literally have to blow away the upgraded VM and start all over again.


For Prism I guess mainly just general usability:

HTML5 web interface
Quick to load
Easy to update through
Simple to put hosts into maintenance mode without having to turn off alarms in a hundred places, otherwise everyone gets alerts
Does Prism do magic to allow ESXi to ad hoc share storage space using direct-attached storage (I'd imagine you're probably using a large SAN anyway)
Do you have random bugs that you need to fix all the time
Have you done upgrades to Prism, how easy is it to do
Connections to host console, is it secure, does it support copy/paste
Updating SSL certs, easy to do or is it not intuitive
Stability / uptime. Can you run multiple instances of the host
Logging, easy to find things
How often do you end up on the phone with support for things you can't solve yourself

Just thinking of all the random things that are day-to-day maintenance, because generally Prism is going to be back-end only, so no one outside of your team is going to work with it. I know in one of the demos I sat in they were touting heavily that you didn't need a monolithic SAN for their software to operate, but I can't recall if that was only if you used their hypervisor. These days, as long as you can keep the redundancy, stuffing a handful of SSDs into a host is a lot more cost-effective than buying an entire SAN just so you can put all of those drives in one place.


Personally, I think it's easier than VMware. There are still "gotchas," like anything else, though.
By comparison, and to a lot of people, Nutanix is still behind in the features and tweaks that can be done. I've repeatedly heard "keep it simple" from Nutanix, so I'm not sure if they will ever push out something that has all the same settings/tweaks you can make in VMware.

HTML5 web interface - yes

Quick to load - Depends on your perspective, but to me, yes.

Easy to update through - Yes. As relapse stated, there is LCM for things like BIOS and firmware, and then a separate update panel for software like AOS, the hypervisor, Foundation, and NCC.

Simple to put hosts into maintenance mode without having to turn off alarms in a hundred places, otherwise everyone gets alerts - Ehh, it depends. I know we brought this up with them before, but as far as I know, it's an 'all or nothing' alert for now. Depending on whether you are managing alerts through PC (Prism Central) or PE (Prism Element), you would have to turn off alerts for everything if at the PC level. If managed at PE, you can turn off cluster alerting. And if you were like my last place and used Netcool, you could "maintenance mode" the alerts for individual items as a workaround.

Does Prism do magic to allow ESXi to ad hoc share storage space using direct-attached storage (I'd imagine you're probably using a large SAN anyway) - You can add storage the way you normally would through ESXi as traditional 3-tier. In Prism, it's all managed as containers that are viewed like datastores in VMware. Does this answer the question?

Do you have random bugs that you need to fix all the time - I wouldn't go so far as to say all the time, but they do happen, and the "fix" is to upgrade to the latest version.

Have you done upgrades to Prism, how easy is it to do - It depends, lol. If you are following the way Nutanix designed it, yes, it's easy. If you are like my last employer, you shoot yourself in the foot with your own gun and complain that Nutanix didn't tell you it was loaded. That being said, there are some things I'd like to see added, like the ability to pause an upgrade/LCM update for change management processes. The current workaround is to only select a few nodes at a time to upgrade, but that requires babysitting to start the next set of nodes.

Connections to host console, is it secure, does it support copy/paste - I can't speak to security; it's not my area and I don't want to give false information. They do work with government business, and that is where some of the security stuff Nutanix uses/offers came from. Using the CVM console is easy through PuTTY and does support copy/paste. I've personally not done a lot through the IPMI console, but I don't think you can use copy/paste there. You can use PuTTY to connect to the CVM and then connect to the host.

Updating SSL certs, easy to do or is it not intuitive - I'm not fluent in SSL certs, but I would say it's easy. From PE: Settings > SSL Certificate > wizard to replace (Regenerate Self Signed Certificate, or Import Key and Certificate).

Stability / uptime. Can you run multiple instances of the host - Maybe I don't fully understand the ask here. Stability and uptime will both increase with additional hardware in a cluster. Any single node is a single point of failure.

Logging, easy to find things - At a high level, yes; there are pages in PE/PC for Tasks and Events, and it's easy to run NCC health checks for support (I recommend attaching the NCC check output to any technical support case).

How often do you end up on the phone with support for things you can't solve yourself - For several of the things above, and there are things that can only be done through the command line. I would say we had someone on the phone with support consistently; the majority of it was for hardware failures (DIMM and disk/SATADOM most common). Deep tech support wasn't too often, and when it was, it was probably our fault for how we got to that point. If you are a larger customer, you will also get a CSM, a Slack support channel, and some leverage for escalations, which helps. We had access to an SRE who was awesome and either fixed whatever we broke or knew what to do to address it.

I'm happy to share any screenshots from my CE install too. There are things I can't do because my environment is so small and doesn't have any domain/AD setup.
 
Personally, I think it's easier than VMware. There are still "gotchas," like anything else, though.
By comparison, and to a lot of people, Nutanix is still behind in the features and tweaks that can be done. I've repeatedly heard "keep it simple" from Nutanix, so I'm not sure if they will ever push out something that has all the same settings/tweaks you can make in VMware.

Simple to put hosts into maintenance mode without having to turn off alarms in a hundred places, otherwise everyone gets alerts - Ehh, it depends. I know we brought this up with them before, but as far as I know, it's an 'all or nothing' alert for now. Depending on whether you are managing alerts through PC (Prism Central) or PE (Prism Element), you would have to turn off alerts for everything if at the PC level. If managed at PE, you can turn off cluster alerting. And if you were like my last place and used Netcool, you could "maintenance mode" the alerts for individual items as a workaround.
Turning them all off in one place is fine for me. I'm not at the scale you are, so it's preferable to just turn off all alerting, because you're already looking at the console anyway and would see if something broke while you were working on something else.

Does Prism do magic to allow ESXi to ad hoc share storage space using direct-attached storage (I'd imagine you're probably using a large SAN anyway) - You can add storage the way you normally would through ESXi as traditional 3-tier. In Prism, it's all managed as containers that are viewed like datastores in VMware. Does this answer the question?
Sorry, I probably didn't explain this well enough. So in VMware you can map multiple nodes in a cluster to a SAN, as long as all of them can connect to the same SAN. If you put a VM onto storage that's inside of host A, then only host A can access it, and you can't use compute from host B to run that VM. So it was explained to me that they can basically mirror a VM's storage between two hosts so that if something happens to host A, host B can start that VM back up. So there would be no need for a central SAN if you can meet your disk resource needs from the hosts. With VMware, if you want to cluster resources, you cannot put the VM onto a local host disk if you want to be able to vMotion it between hosts; the storage can't move for DRS, AFAIK. I actually think now you could get away with a hot move of storage and compute between two hosts, but I can't say I've tried it, because it's already on a SAN since that was always a requirement.

Do you have random bugs that you need to fix all the time - I wouldn't go so far as to say all the time, but they do happen, and the "fix" is to upgrade to the latest version.
As long as upgrading actually fixes it, that's not a bad thing. It's when you get told to upgrade and still have issues.

Connections to host console, is it secure, does it support copy/paste - I can't speak to security; it's not my area and I don't want to give false information. They do work with government business, and that is where some of the security stuff Nutanix uses/offers came from. Using the CVM console is easy through PuTTY and does support copy/paste. I've personally not done a lot through the IPMI console, but I don't think you can use copy/paste there. You can use PuTTY to connect to the CVM and then connect to the host.
Yeah, with this one it didn't occur to me that I specifically meant hosts with GUIs, which you may not have too many of in your environment. Most of the time you're just in the console until you can get your remote solution working, but I've found that copy/paste tends to be more important for recovering, say, Server 2016 versus Ubuntu. Sounds like Ubuntu would be fine but Windows would be a no-go. That's basically the same as I've seen for most hypervisors, so not really better or worse.

Updating SSL certs, easy to do or is it not intuitive - I'm not fluent in SSL certs, but I would say it's easy. From PE: Settings > SSL Certificate > wizard to replace (Regenerate Self Signed Certificate, or Import Key and Certificate).
That sounds fair enough. That's about how easy it should be. VMware overcomplicates this, and their official directions have you dropping to a shell and running a bunch of commands. It's doable but certainly not straightforward. It's kind of worse to try to do it in the GUI because of the way they tie all of the certs together.

Stability / uptime. Can you run multiple instances of the host - Maybe I don't fully understand the ask here. Stability and uptime will both increase with additional hardware in a cluster. Any single node is a single point of failure.
So yes, a node is a point of failure, but do you have issues with nodes dropping out a lot? I've definitely seen the pink screen of death a few times on ESXi, but overall I can't recall seeing vCenter crash. So by multiple instances I meant: can you run redundant VMs of Prism Central and Prism Element, or is the front end a single point of failure?
 
Of course you have a snapshot, or in the case of vCenter you should have an entirely new VM, because in my experience it always requires you to make a new VM and then copies the data to the new VM. So it's just a matter of deleting the new VM it created and starting the entire process again. The bad part is that generally I have the VM for it on one of the nodes it's managing, because I'm not going to run another host just for the sake of handling vCenter, so you need to make sure you know which node it's on before touching anything; otherwise, if the upgrade fails, you get to start searching through nodes till you find the one that was running it and has the old powered-off VM. I've upgraded a heck of a lot of software over the years and I've never seen anything crash and burn like VMware does.

bman212121
If you're doing an upgrade via ISO image mounted on a system and running through the wizard, then yes, it will create a new VM and copy over the data; that is the option it gives you. However, you can also upgrade via ISO by mounting it to the vCenter VM and using https://vcenter:5480 - Update / Check Updates / Check CD-ROM + URL.

It will then do an in-place upgrade from the mounted ISO.

I've done my share of upgrades as well, and had a vCenter go from 5.5 to 6.7 U3c, still working perfectly and managing about $5M in hardware. I will agree, though, like any software: when it works, it is awesome! When it breaks, it REALLY breaks! Nuke it from orbit and start from scratch, or hope you've got a backup.
 

Simple to put hosts into maintenance mode without having to turn off alarms in a hundred places, otherwise everyone gets alerts - Ehh, it depends. I know we brought this up with them before, but as far as I know, it's an 'all or nothing' alert for now. Depending on whether you are managing alerts through PC (Prism Central) or PE (Prism Element), you would have to turn off alerts for everything if at the PC level. If managed at PE, you can turn off cluster alerting. And if you were like my last place and used Netcool, you could "maintenance mode" the alerts for individual items as a workaround.
Turning them all off in one place is fine for me. I'm not at the scale you are, so it's preferable to just turn off all alerting, because you're already looking at the console anyway and would see if something broke while you were working on something else.
Even if you're not at large scale, I would recommend having the discussion if there are hardware refreshes or possible expansion in the future. It can quickly get annoying. And if it's not you doing the work, having to rely on someone else to turn the alerting back on may be a risk.


Does Prism do magic to allow ESXi to ad hoc share storage space using direct-attached storage (I'd imagine you're probably using a large SAN anyway) - You can add storage the way you normally would through ESXi as traditional 3-tier. In Prism, it's all managed as containers that are viewed like datastores in VMware. Does this answer the question?
Sorry, I probably didn't explain this well enough. So in VMware you can map multiple nodes in a cluster to a SAN, as long as all of them can connect to the same SAN. If you put a VM onto storage that's inside of host A, then only host A can access it, and you can't use compute from host B to run that VM. So it was explained to me that they can basically mirror a VM's storage between two hosts so that if something happens to host A, host B can start that VM back up. So there would be no need for a central SAN if you can meet your disk resource needs from the hosts. With VMware, if you want to cluster resources, you cannot put the VM onto a local host disk if you want to be able to vMotion it between hosts; the storage can't move for DRS, AFAIK. I actually think now you could get away with a hot move of storage and compute between two hosts, but I can't say I've tried it, because it's already on a SAN since that was always a requirement.
OK, I think I got it. A few things going on here :) The non-shared local storage (which shows up in ESXi) should not be used; if it is, it's the same issue as VMware. The non-shared local storage usually doesn't have much capacity anyway, as it's usually a SATADOM or M.2 drive(s). Yes, each of the hosts is loaded with disks, ideally the same capacity, which is pooled together. This could be viewed as a "SAN" for the sake of comparison. Provided you have enough available capacity, you can lose 1 or 2 nodes at a time depending on the replication factor (RF2 = 1 node failure tolerated, RF3 = 2 node failures tolerated). Each of the CVMs also works as a storage controller for its host. If you only lose the controller VM, you will lose that node's storage, but the secondary copies will provide the storage until the CVM is restored. In this scenario, you can still run VMs on the node, but your storage will be pulled over the network.
-- The mirror you are talking about in a VMware cluster would be something like vSAN, or a 3rd-party vendor like StorMagic, to get HCI functionality across a cluster.
-- On Nutanix, it's built into Prism: when you deploy a cluster, all the storage is mounted into a single pool and managed through containers. The replication of the data is done automatically between nodes, blocks, or racks depending on requirements and configuration.
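To put rough numbers on the capacity side of RF, here's my own back-of-the-napkin illustration (it ignores CVM, metadata, and rebuild reservations and assumes identical nodes, so treat it as a ballpark only):

```python
# Back-of-the-napkin usable capacity under a replication factor (RF).
# RF2 keeps 2 copies of the data, RF3 keeps 3, per the post above.

def usable_tb(nodes, tb_per_node, rf=2):
    """Usable space is roughly the raw pool divided by the number of copies."""
    return nodes * tb_per_node / rf

def can_reprotect_after_one_node(nodes, tb_per_node, rf, used_tb):
    """Is there enough raw space on the surviving nodes to rebuild full RF?"""
    return used_tb * rf <= (nodes - 1) * tb_per_node

print(usable_tb(3, 4, rf=2))                             # 3 x 4TB nodes, RF2 -> 6.0 TB usable
print(can_reprotect_after_one_node(3, 4, 2, used_tb=5))  # needs 10TB raw on 8TB -> False
```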



Do you have random bugs that you need to fix all the time - I wouldn't go so far as to say all the time, but they do happen, and the "fix" is to upgrade to the latest version.
As long as upgrading actually fixes it, that's not a bad thing. It's when you get told to upgrade and still have issues.

They have generally been pretty good about troubleshooting and discussing the bug and what the upgrade will address, since recommending an upgrade comes with some customer pushback due to the maintenance mode required across the cluster.


Connections to host console, is it secure, does it support copy/paste - I can't speak to security; it's not my area and I don't want to give false information. They do work with government business, and that is where some of the security stuff Nutanix uses/offers came from. Using the CVM console is easy through PuTTY and does support copy/paste. I've personally not done a lot through the IPMI console, but I don't think you can use copy/paste there. You can use PuTTY to connect to the CVM and then connect to the host.
Yeah, with this one it didn't occur to me that I specifically meant hosts with GUIs, which you may not have too many of in your environment. Most of the time you're just in the console until you can get your remote solution working, but I've found that copy/paste tends to be more important for recovering, say, Server 2016 versus Ubuntu. Sounds like Ubuntu would be fine but Windows would be a no-go. That's basically the same as I've seen for most hypervisors, so not really better or worse.
The only GUI would be the IPMI, iDRAC, iLO, etc., which, if you are connected via the flat switch or it's networked in, lets you copy/paste. For the configuration and the portable Foundation or Foundation VM, you can copy/paste the IP info into the wizards. Once they are configured, there is little reason to actually go into the hosts unless something is really going sideways; at that point you are on the phone/remote with support and they are driving the troubleshooting.




Updating SSL certs, easy to do or is it not intuitive - I'm not fluent in SSL certs, but I would say it's easy. From PE: Settings > SSL Certificate > wizard to replace (Regenerate Self Signed Certificate, or Import Key and Certificate).
That sounds fair enough. That's about how easy it should be. VMware overcomplicates this, and their official directions have you dropping to a shell and running a bunch of commands. It's doable but certainly not straightforward. It's kind of worse to try to do it in the GUI because of the way they tie all of the certs together.
I agree with this. I remember having to do it for some VMware product when I first joined the team, and it was a PITA.



Stability / uptime. Can you run multiple instances of the host - Maybe I don't fully understand the ask here. Stability and uptime will both increase with additional hardware in a cluster. Any single node is a single point of failure.
So yes, a node is a point of failure, but do you have issues with nodes dropping out a lot? I've definitely seen the pink screen of death a few times on ESXi, but overall I can't recall seeing vCenter crash. So by multiple instances I meant: can you run redundant VMs of Prism Central and Prism Element, or is the front end a single point of failure?
The only problems related to physical nodes dropping out were hard DIMM failures that caused PSODs, or the SATADOM (G4s and earlier) failing during upgrades/Foundation. I feel confident saying it was not a regular occurrence, even with the number of nodes running. As for PC, you have 2 options: 1 PCVM, or 3 PCVMs for HA support, but they have to be in the same cluster. As long as the cluster is running, PE should be available, as each CVM is effectively running PE with an assigned VIP. You can use the VIP or a CVM IP to get to Prism Element. If you lose more nodes at the same time than the replication factor can handle, the entire cluster will go offline. I will say, in my experience, Nutanix has actually been pretty resilient at recovering after going down hard. Same as the comment above, you are already on the phone with support at this point, and they would likely be driving post-recovery health checks if it got really bad.


If we want to continue the discussion, we may want to change the formatting. lol
 
No, I think you basically hit everything that I was kind of wondering about. I'm certainly intrigued, and it does sound like it could be workable in my environment. I appreciate you taking the time to answer the questions!

I think for anything else I'd want to know, I'll probably just throw together a test environment, and I'm sure they would be more than happy to do a PoC. You just know that if you do a PoC, your phone and email aren't going to hear the end of it for the next 6 months though!
 
Awesome, glad this thread was able to provide some value :)

They do have a demo you can use to test-drive Prism, but I understand the need to get hands-on... nothing compares ;) lol
And hopefully the sales guys aren't that bad, but if you do get "free" POC gear, they may be in your ear regularly. lol What state/region are you in? Maybe I can kick them for some info if you want. I know our sales/account team for the area was cool. We're considered north central - WI, IL, MN (probably a few others I'm forgetting), for an idea. Feel free to send me a PM too if you want.
 
So I just uncovered a huge security flaw when building custom Windows VMs using a custom script. When you do this (and I am using a PowerShell script I wrote), it actually takes the unattend file and mounts it to an IDE-based CD-ROM. I personally have the machine join the domain, and that requires credentials. Well, it turns out Nutanix leaves the CD-ROM in place after the VM is built, and a copy of the unattend file is there as well. This means anyone who logs in can go and get the credentials that were used to join the domain in plain text.

My script has a custom XML writer that injects input from the user into the XML file locally, saves it as an array, and then removes all data from the local copy. Now that I've found this bug, which is now with Nutanix engineering, I have modified my script to wait about 3 minutes after the VM is powered on and then eject the CD-ROM drive. Since it is IDE, you cannot delete the CD-ROM without powering off the VM.
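For anyone wanting to copy that mitigation, here's a rough sketch of the flow. This is not relapse808's actual script (his is PowerShell, and the exact eject mechanism isn't shown in the thread); vm_is_powered_on and eject_cdrom are placeholders you would wire up to whatever your environment exposes (Prism REST, acli, or the Nutanix cmdlets).

```python
# Sketch of the "wait after power-on, then pull the unattend CD-ROM" mitigation.
# The two helpers are deliberately stubbed out; implement them against your own
# Prism REST / acli / cmdlet tooling.
import time

def vm_is_powered_on(vm_name):
    """Placeholder: query your hypervisor/Prism for the VM's power state."""
    raise NotImplementedError

def eject_cdrom(vm_name):
    """Placeholder: empty the IDE CD-ROM so the unattend ISO is detached."""
    raise NotImplementedError

def scrub_unattend_media(vm_name, grace_seconds=180):
    """After the VM powers on, give setup time to finish reading the unattend
    file (~3 minutes, per the post above), then eject the CD-ROM so the
    domain-join credentials aren't left mounted in plain text."""
    while not vm_is_powered_on(vm_name):
        time.sleep(10)
    time.sleep(grace_seconds)
    eject_cdrom(vm_name)
```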
 
I just got done building two more clusters with VMware as the hypervisor. It was a lot of fun, and personally I think it works much better than AHV. I will say AHV is way easier to set up.
 
I just got done building two more clusters with VMware as the hypervisor. It was a lot of fun, and personally I think it works much better than AHV. I will say AHV is way easier to set up.

How so? Are you talking about the Foundation process, which deploys the hypervisor with it?
VMware works much better than AHV? Because of features we grew to love? Something else?

Genuinely interested in your perspective.
 