Does vSphere support any kind of hardware watchdog?

Discussion in 'Virtualized Computing' started by danswartz, Dec 30, 2017.

  1. danswartz

    danswartz 2[H]4U

    Messages:
    3,584
    Joined:
    Feb 25, 2011
    Running a 6.5 host, Build 6765664. It had been running fine since the update to this build on 12/10. This morning around 10AM, it became unresponsive. I was out of town, so I was unable to check anything until just now. The IPMI console showed everything apparently okay, except for the host being unresponsive (not even to pings). I rebooted it, and it came up fine, but I'd kinda like to avoid hangs in the future. I've been searching via Google, but I don't see any kind of hardware watchdog support. Is there such a thing? It's a single host, so HA won't help me here. Thanks!
     
  2. k1pp3r

    k1pp3r [H]ardness Supreme

    Messages:
    7,650
    Joined:
    Jun 16, 2004
    There is for some IPMI, iDRAC, and iLO setups, but it's typically a vendor-specific ISO you have to install.

    That said, your issue is likely not hardware. Do a fresh install of ESXi on the host.
     
  3. danswartz

    danswartz 2[H]4U

    Messages:
    3,584
    Joined:
    Feb 25, 2011
    This is a whitebox in my home lab, so there is no vendor involved. I'm not saying this is a HW issue - I'd just like insurance that it won't lock up again when no-one is here to push the reset button :)
     
  4. REDYOUCH

    REDYOUCH [H]ardness Supreme

    Messages:
    4,541
    Joined:
    Mar 17, 2001
    This is bad advice. You have no idea what the issue is until you review the logs and perform some level of diagnostics.
     
  5. k1pp3r

    k1pp3r [H]ardness Supreme

    Messages:
    7,650
    Joined:
    Jun 16, 2004
    It's not bad advice if your alerting is set up correctly.

    However, OP's issue was the hypervisor locking up. Unless you have outside monitoring for that host, it's not going to alert you in any fashion.
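    A minimal sketch of what "outside monitoring" could look like, run from a second machine on the same network. It assumes the host answers TCP on port 443 while healthy; the function names, the 60-second interval, and the three-failure threshold are illustrative choices, not anything from this thread.

```python
import socket
import time
from typing import Callable, Sequence


def host_is_reachable(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def should_alert(results: Sequence[bool], threshold: int = 3) -> bool:
    """Alert only when the last `threshold` checks all failed, to ride out brief blips."""
    if len(results) < threshold:
        return False
    return not any(results[-threshold:])


def monitor(host: str, notify: Callable[[str], None],
            interval: float = 60.0, threshold: int = 3) -> None:
    """Poll `host` forever and call `notify` once each time it goes down."""
    history = []
    alerted = False
    while True:
        # Keep only the most recent `threshold` results.
        history = (history + [host_is_reachable(host)])[-threshold:]
        down = should_alert(history, threshold)
        if down and not alerted:
            notify(f"{host} failed {threshold} checks in a row")
        alerted = down  # resets once the host answers again
        time.sleep(interval)
```

    Something like `monitor("esxi.example.lan", print)` would start it (the hostname is a placeholder). Swapping `print` for a function that sends mail or a push notification is left open here.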
     
  6. REDYOUCH

    REDYOUCH [H]ardness Supreme

    Messages:
    4,541
    Joined:
    Mar 17, 2001
    His host locked up one time and you're telling him to re-install ESXi. He should at least spend a few minutes taking a look at the server logs to see if he can find out what occurred.
     
  7. k1pp3r

    k1pp3r [H]ardness Supreme

    Messages:
    7,650
    Joined:
    Jun 16, 2004
    I'm assuming troubleshooting has already been done. Besides, reinstalling ESXi can easily be done without affecting the VMFS volumes. Then you just re-import your machines and move on.
     
  8. danswartz

    danswartz 2[H]4U

    Messages:
    3,584
    Joined:
    Feb 25, 2011
    I'm not sure which logs to look at. I noticed a new build was available, so since I was suspicious something might have gotten corrupted, I installed that build. In case this happens again, where do you suggest looking? e.g. which logfiles? Thanks!
     
  9. lopoetve

    lopoetve Imhotep

    Messages:
    29,184
    Joined:
    Oct 11, 2001
    /var/log/vmkernel.log. If your logs are stored on reliable storage, that will still be there. See what the last few messages were. /var/log/vmkwarning.log is a warn/error version of the same log.
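    One way to pull the last few messages out of either file, as a rough sketch. It assumes a Python interpreter is available wherever the log is read (on the host or on a copy of the file); `tail_lines` is a made-up helper name.

```python
from collections import deque
from typing import List


def tail_lines(path: str, count: int = 50) -> List[str]:
    """Return the last `count` lines of a text file without loading it all into memory."""
    with open(path, "r", errors="replace") as handle:
        # A bounded deque keeps only the newest `count` lines as the file streams past.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
```

    For example, `tail_lines("/var/log/vmkernel.log", 20)` returns the final 20 lines, which are the ones written just before a hang if the log survived the reboot.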
     
  10. danswartz

    danswartz 2[H]4U

    Messages:
    3,584
    Joined:
    Feb 25, 2011
    Thanks! vSphere is installed on a small (8GB) DOM, so the logs should still be available. Hasn't happened again, so far (fingers crossed...)
     
  11. lopoetve

    lopoetve Imhotep

    Messages:
    29,184
    Joined:
    Oct 11, 2001
    Eh, may be big enough and stable enough for it to have tagged it as stable.
     
  12. danswartz

    danswartz 2[H]4U

    Messages:
    3,584
    Joined:
    Feb 25, 2011
    Turns out it wasn't, so that's why I saw nothing useful in the logs :( I changed the syslog global settings to stash the logs on one of the NFS datastores.
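    For anyone wanting to script the same change, here is a hedged sketch that builds the two `esxcli` calls typically used to point syslog at a persistent directory and reload it. The datastore path in the example is a placeholder, the helper names are invented, and the poster may well have made the change through the UI instead. It defaults to a dry run that only prints the commands.

```python
import subprocess
from typing import List


def build_syslog_commands(logdir: str) -> List[List[str]]:
    """Return the esxcli invocations that set the syslog directory and reload syslog."""
    return [
        ["esxcli", "system", "syslog", "config", "set", f"--logdir={logdir}"],
        ["esxcli", "system", "syslog", "reload"],
    ]


def apply_syslog_config(logdir: str, dry_run: bool = True) -> List[List[str]]:
    """Print the commands (dry run) or execute them on the host; return what was built."""
    commands = build_syslog_commands(logdir)
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            # Only meaningful when run on the ESXi host itself.
            subprocess.run(argv, check=True)
    return commands
```

    Calling `apply_syslog_config("/vmfs/volumes/nfs-datastore/logs")` prints the two commands for review before anything is changed.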
     
  13. ChRoNo16

    ChRoNo16 [H]ard|Gawd

    Messages:
    1,207
    Joined:
    Feb 3, 2011
    I had so many bugs and errors with ESXi 6.5 that I went back down to 5.5. Less support for newer stuff, but it's at least stable.