Windows 2012 R2 Failover Cluster with Virtual RDM - Cluster Across Boxes?

Discussion in 'Virtualized Computing' started by KapsZ28, Jun 20, 2017.

  1. KapsZ28

    KapsZ28 2[H]4U

    Messages:
    2,108
    Joined:
    May 29, 2009
    It seems difficult to find a solid answer to this question. VMs are running on vSphere 5.5 across multiple ESXi hosts. VMware documentation says both physical and virtual are supported in this configuration but physical is recommended. The same documentation also states, "Clusters across physical machines with non-pass-through RDM is supported only for clustering with Windows Server 2003. It is not supported for clustering with Windows Server 2008." But I am interested in 2012 R2.

    Main reason I am looking to use virtual RDM is to run native Veeam backups. I am not overly thrilled about having to use the Veeam agent backup, due to its limitations and its reliance on Microsoft's VSS writer.

    Anyone know what is officially supported by both VMware and Microsoft, and what works best, especially when you want to run snapshot-based backups?

    Also, the clustering is being used on file servers.
     
  2. k1pp3r

    k1pp3r [H]ardness Supreme

    Messages:
    7,323
    Joined:
    Jun 16, 2004
    Are you referring to a Windows cluster with RDM for a shared disk?
     
  3. KapsZ28

    KapsZ28 2[H]4U

    Messages:
    2,108
    Joined:
    May 29, 2009
    Yes. I thought that was pretty obvious from what I wrote. ;)
     
  4. k1pp3r

    k1pp3r [H]ardness Supreme

    Messages:
    7,323
    Joined:
    Jun 16, 2004
  5. KapsZ28

    KapsZ28 2[H]4U

    Messages:
    2,108
    Joined:
    May 29, 2009
    Thanks for the article, but that seems to be the same as the 5.5 article I was reading. Virtual mode only for cluster in a box and physical mode for cluster across boxes. VMFS also only supports cluster in a box. Cluster in a box is only good for guest OS failures and maintenance. No protection against ESXi host failure. Kind of sucks. Another reason more storage vendors need to build in CIFS with AD authentication.
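
    To make the matrix above concrete, here is a minimal sketch that encodes it exactly as I summarized it in this post. It is not copied from VMware's tables, and the names (Disk, Topology, is_supported, survives_host_failure) are made up for illustration.

```python
from enum import Enum


class Disk(Enum):
    VMFS_VMDK = "virtual disk on VMFS"
    VIRTUAL_RDM = "virtual compatibility mode RDM"
    PHYSICAL_RDM = "physical compatibility mode RDM"


class Topology(Enum):
    CLUSTER_IN_A_BOX = "all cluster nodes on one ESXi host"
    CLUSTER_ACROSS_BOXES = "cluster nodes on different ESXi hosts"


# Assumption: this mirrors the summary in this post, not an official matrix.
SUPPORTED = {
    Topology.CLUSTER_IN_A_BOX: {Disk.VMFS_VMDK, Disk.VIRTUAL_RDM},
    Topology.CLUSTER_ACROSS_BOXES: {Disk.PHYSICAL_RDM},
}


def is_supported(disk: Disk, topology: Topology) -> bool:
    """Return True if the shared-disk type is listed for the topology."""
    return disk in SUPPORTED[topology]


def survives_host_failure(topology: Topology) -> bool:
    """Only a cluster spread across hosts protects against an ESXi host failure."""
    return topology is Topology.CLUSTER_ACROSS_BOXES


if __name__ == "__main__":
    for topology in Topology:
        for disk in Disk:
            print(f"{topology.name:22} {disk.name:13} supported={is_supported(disk, topology)}")
```

    By this reading, the only combination that gives both support and protection against host failure is a physical mode RDM in a cluster across boxes, which is exactly the one that rules out VMware snapshots.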
     
  6. lopoetve

    lopoetve Imhotep

    Messages:
    28,900
    Joined:
    Oct 11, 2001
    Physical only. Don't use virtual RDMs. :) Why would a physical mode RDM be a problem?
     
  7. KapsZ28

    KapsZ28 2[H]4U

    Messages:
    2,108
    Joined:
    May 29, 2009
    Backups... Have a customer with many physical RDMs and on more than one occasion the VSS writer could no longer take snapshots and therefore Veeam Agent backups would fail. With a virtual RDM I don't have to worry about Microsoft's VSS writer since VMware snapshots can be used with a typical Veeam backup job.

    They have all this clustering set up because there can't be any downtime. No downtime means I can't even run CHKDSK to fix the issue. I am curious if this is a Microsoft problem or something with the storage they use. Three LUNs within six months, all on different servers, have experienced this problem, and it always seems to happen when they fail over the LUNs during patching. All the servers have 3-4 physical RDMs each and only one volume is affected. So it seems less likely to be an OS VSS issue and more likely a corruption problem on the volume.
     
  8. lopoetve

    lopoetve Imhotep

    Messages:
    28,900
    Joined:
    Oct 11, 2001
    Snap shorting a PGR SCSI-3 reserved volume in use by two or more machines is a very bad idea. Same as snap shorting anything having to do with the OS involved in that.

    In other words, that already doesn't work like you really want it to (and you can't restore it), so don't bother. App based backups!

    What's the storage?
     
  9. KapsZ28

    KapsZ28 2[H]4U

    Messages:
    2,108
    Joined:
    May 29, 2009
    So although virtual RDMs support snapshots, it is not recommended with bus sharing? What app based backup would you recommend?

    I believe the physical storage is 3PAR, but they are using DataCore SANsymphony.
     
  10. lopoetve

    lopoetve Imhotep

    Messages:
    28,900
    Joined:
    Oct 11, 2001
    Depends on what the app is. If it's SQL (90% of the clusters), just use SQL backups to a share that Veeam then does things with (or any other backup software).

    EXTREMELY not recommended with bus sharing - it honestly shouldn't work, but sometimes does by sheer luck. Remember, VMware snapshots are not coordinated between machines like you'd want when you're doing complex SCSI things with a shared volume; they're unaware that any of that is going on.

    Virtualizing storage like SANsymphony does always makes things interesting. If they're having corruption on physical mode RDMs, that means the software somewhere in there is mangling the logical unit and/or the abstraction between the physical blocks and the logical blocks. I'd... well, not do that in the first place, personally, at least not at a block level. You do that at an object level or somewhere else, but not block... and not block when you're already doing abstraction there with VMs. But that's me. Either way, it's doing something ~wrong~ in there.

    And damn the autocorrect - snap shorting? lawl.
     
  11. KapsZ28

    KapsZ28 2[H]4U

    Messages:
    2,108
    Joined:
    May 29, 2009
    The VSS issue turned out to be a VMware issue with multi-pathing. They disabled multi-pathing and now it is working correctly.