The host error hook is not working: it fails with a fencing error when a hypervisor goes down.
Versions of the related components and OS (frontend, hypervisors, VMs):
OpenNebula 5.4.13
CentOS 7
Steps to reproduce:
- uncomment the host hook in oned.conf on all HA frontend nodes
HOST_HOOK = [
NAME = "error",
ON = "ERROR",
COMMAND = "ft/host_error.rb",
ARGUMENTS = "$ID -m -p 0",
REMOTE = "no" ]
- restart oned on all HA frontend nodes
- shut down a hypervisor that has a VM running on it
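To double-check that the hook is really uncommented and identical on every frontend, something like the following could be run against each node's oned.conf. This is only a minimal sketch: `parse_host_hooks` is a hypothetical helper (not part of OpenNebula), it handles only the simple `KEY = "value"` layout shown above, and it matches straight double quotes only.

```python
import re

def parse_host_hooks(conf_text):
    """Return one dict per uncommented HOST_HOOK block in oned.conf-style text."""
    # Drop full-line comments so a commented-out hook is not picked up.
    active = "\n".join(
        line for line in conf_text.splitlines()
        if not line.lstrip().startswith("#")
    )
    hooks = []
    # Each block runs from "HOST_HOOK = [" to the first closing bracket.
    for block in re.finditer(r"HOST_HOOK\s*=\s*\[(.*?)\]", active, re.S):
        attrs = dict(re.findall(r'(\w+)\s*=\s*"([^"]*)"', block.group(1)))
        hooks.append(attrs)
    return hooks

# Sample text mirroring the hook from this report, plus a commented-out one.
sample = '''
#HOST_HOOK = [
#    NAME = "commented",
#    ON = "ERROR" ]
HOST_HOOK = [
    NAME      = "error",
    ON        = "ERROR",
    COMMAND   = "ft/host_error.rb",
    ARGUMENTS = "$ID -m -p 0",
    REMOTE    = "no" ]
'''

for hook in parse_host_hooks(sample):
    print(hook)
```

Comparing the returned dictionaries across the frontends would show whether one node still has the hook commented out or configured with different arguments.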
Current results:
oned.log:
Wed Jul 11 18:46:07 2018 [Z0][InM][I]: Command execution fail: 'if [ -x "/var/tmp/one/im/run_probes" ]; then /var/tmp/one/im/run_probes kvm /var/lib/one//datastores 4124 20 2 ord-virt-004; else exit 42; fi'
Wed Jul 11 18:46:07 2018 [Z0][InM][I]: ssh: connect to host ord-virt-004 port 22: Invalid argument
Wed Jul 11 18:46:07 2018 [Z0][InM][I]: ExitCode: 255
Wed Jul 11 18:46:09 2018 [Z0][ReM][D]: Req:3520 UID:0 one.zone.raftstatus invoked
Wed Jul 11 18:46:09 2018 [Z0][ReM][D]: Req:3520 UID:0 one.zone.raftstatus result SUCCESS, "<SERVER_ID>0</…"
Wed Jul 11 18:46:09 2018 [Z0][ReM][D]: Req:9296 UID:0 one.vmpool.info invoked , -2, -1, -1, -1
Wed Jul 11 18:46:09 2018 [Z0][ReM][D]: Req:9296 UID:0 one.vmpool.info result SUCCESS, "<VM_POOL>13<…"
Wed Jul 11 18:46:09 2018 [Z0][ReM][D]: Req:9968 UID:0 one.vmpool.info invoked , -2, -1, -1, -1
Wed Jul 11 18:46:09 2018 [Z0][ReM][D]: Req:9968 UID:0 one.vmpool.info result SUCCESS, "<VM_POOL>13<…"
Wed Jul 11 18:46:11 2018 [Z0][InM][I]: Command execution fail: 'if [ -x "/var/tmp/one/im/run_probes" ]; then /var/tmp/one/im/run_probes kvm /var/lib/one//datastores 4124 20 2 ord-virt-004; else exit 42; fi'
Wed Jul 11 18:46:11 2018 [Z0][InM][I]: ssh: connect to host ord-virt-004 port 22: Invalid argument
Wed Jul 11 18:46:11 2018 [Z0][InM][I]: ExitCode: 255
Wed Jul 11 18:46:14 2018 [Z0][InM][I]: Command execution fail: 'if [ -x "/var/tmp/one/im/run_probes" ]; then /var/tmp/one/im/run_probes kvm /var/lib/one//datastores 4124 20 2 ord-virt-004; else exit 42; fi'
Wed Jul 11 18:46:14 2018 [Z0][InM][I]: ssh: connect to host ord-virt-004 port 22: Invalid argument
Wed Jul 11 18:46:14 2018 [Z0][InM][I]: ExitCode: 255
Wed Jul 11 18:46:18 2018 [Z0][InM][D]: Monitoring host ord-virt-003 (1)
Wed Jul 11 18:46:18 2018 [Z0][InM][I]: Command execution fail: 'if [ -x "/var/tmp/one/im/run_probes" ]; then /var/tmp/one/im/run_probes kvm /var/lib/one//datastores 4124 20 2 ord-virt-004; else exit 42; fi'
Wed Jul 11 18:46:18 2018 [Z0][InM][I]: ssh: connect to host ord-virt-004 port 22: Invalid argument
Wed Jul 11 18:46:18 2018 [Z0][InM][I]: ExitCode: 255
Wed Jul 11 18:46:18 2018 [Z0][ONE][E]: Error monitoring Host ord-virt-004 (2): -
Wed Jul 11 18:46:19 2018 [Z0][ReM][D]: Req:8960 UID:0 one.system.config invoked
Wed Jul 11 18:46:19 2018 [Z0][ReM][D]: Req:8960 UID:0 one.system.config result SUCCESS, "<AUTH_MAD>…"
Wed Jul 11 18:46:19 2018 [Z0][ReM][D]: Req:6400 UID:0 one.host.info invoked , 2
Wed Jul 11 18:46:19 2018 [Z0][ReM][D]: Req:6400 UID:0 one.host.info result SUCCESS, "2<NAM…"
Wed Jul 11 18:46:19 2018 [Z0][HKM][D]: Message received: LOG I 2 Command execution fail: /var/lib/one/remotes//hooks/ft/host_error.rb 2 -m -p 0
Wed Jul 11 18:46:19 2018 [Z0][HKM][D]: Message received: LOG I 2 ExitCode: 255
Wed Jul 11 18:46:19 2018 [Z0][HKM][D]: Message received: EXECUTE FAILURE 2 error: -
host_error.log:
[2018-07-11 18:46:19 +0000][HOST 2][I] Hook launched
[2018-07-11 18:46:19 +0000][HOST 2][I] hostname: ord-virt-004
[2018-07-11 18:46:19 +0000][HOST 2][I] Fencing enabled
[2018-07-11 18:46:19 +0000][HOST 2][E]
[2018-07-11 18:46:19 +0000][HOST 2][E] Fencing error
[2018-07-11 18:46:19 +0000][HOST 2][E] Exiting due to previous error.
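For anyone triaging a longer host_error.log, the error entries can be pulled out with a few lines of Python. This is a rough sketch that assumes every line follows the `[timestamp][HOST id][level] message` layout seen in the excerpt above; `error_lines` is a hypothetical helper name.

```python
import re

# Matches lines such as: [2018-07-11 18:46:19 +0000][HOST 2][E] Fencing error
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[HOST (?P<host_id>\d+)\]\[(?P<level>[A-Z])\] ?(?P<msg>.*)$"
)

def error_lines(log_text):
    """Return (timestamp, host_id, message) for every [E] line in the log."""
    found = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "E":
            found.append(
                (match.group("ts"), int(match.group("host_id")), match.group("msg"))
            )
    return found

# The excerpt from this report.
sample_log = """\
[2018-07-11 18:46:19 +0000][HOST 2][I] Hook launched
[2018-07-11 18:46:19 +0000][HOST 2][I] hostname: ord-virt-004
[2018-07-11 18:46:19 +0000][HOST 2][I] Fencing enabled
[2018-07-11 18:46:19 +0000][HOST 2][E]
[2018-07-11 18:46:19 +0000][HOST 2][E] Fencing error
[2018-07-11 18:46:19 +0000][HOST 2][E] Exiting due to previous error.
"""

for ts, host_id, msg in error_lines(sample_log):
    print(ts, host_id, repr(msg))
```

On this excerpt the first error entry has an empty message, so the log itself gives no detail about why fencing failed.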
Expected results:
The VM should be restarted on another available node.