HA. VM stuck in BOOT_POWEROFF state

When one of the hosts goes down, HA for the VMs works incorrectly in several specific cases.

For example:

  • when we stop one of the hosts, ONE waits for several monitoring cycles before using the stonith script;
  • the host shuts down and is already inaccessible from ONE, but the VMs are still in RUNNING;
  • at this step we send POWEROFF to the VMs;
  • while the host is down, the VMs stay in SHUTDOWN state;
  • when the host comes back up, the VMs go to POWEROFF state;
  • then, when we try to start those VMs, we get:

Mon Feb 20 12:14:47 2017 [Z0][VM][I]: New LCM state is SHUTDOWN_POWEROFF
Mon Feb 20 12:21:00 2017 [Z0][LCM][I]: VM reported SHUTDOWN by the drivers
Mon Feb 20 12:21:00 2017 [Z0][VM][I]: New state is POWEROFF
Mon Feb 20 12:21:00 2017 [Z0][VM][I]: New LCM state is LCM_INIT
Mon Feb 20 12:22:22 2017 [Z0][VM][I]: New state is ACTIVE
Mon Feb 20 12:22:22 2017 [Z0][VM][I]: New LCM state is BOOT_POWEROFF
Mon Feb 20 12:22:22 2017 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/68/deployment.2
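For reference, the stuck VM can be inspected and pushed back to a consistent state from the CLI roughly like this (VM ID 68 is the one from the log above; the available options may differ between versions, see onevm recover --help):

onevm show 68 | grep -i state    # check STATE / LCM_STATE of the affected VM
onevm recover --failure 68       # should move a VM stuck in BOOT_POWEROFF back to POWEROFF
onevm resume 68                  # try to boot it again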

Hello, you can adjust the monitoring driver settings in /etc/one/oned.conf:

#-------------------------------------------------------------------------------
#  KVM UDP-push Information Driver Manager Configuration
#    -r number of retries when monitoring a host
#    -t number of threads, i.e. number of hosts monitored at the same time
#    -w Timeout in seconds to execute external commands (default unlimited)
#-------------------------------------------------------------------------------
IM_MAD = [
      NAME          = "kvm",
      SUNSTONE_NAME = "KVM",
      EXECUTABLE    = "one_im_ssh",
      ARGUMENTS     = "-r 3 -t 15 kvm" ]
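How quickly a failed host is flagged also depends on the global monitoring interval in the same file. As a sketch (60 here is just an example value, not a recommendation):

# /etc/one/oned.conf - seconds between host monitoring rounds; together with
# the -r retries above this defines how long it takes to put the host in ERROR
MONITORING_INTERVAL = 60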

The “stonith” script should automatically migrate the VMs to another host:

https://docs.opennebula.org/5.2/advanced_components/ha/ftguide.html#host-failures
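The piece that actually resubmits the VMs is the host error hook described in that guide. Roughly (please double-check the exact arguments against the guide for your version):

#  Host error hook (from the HA guide): -m reschedules the VMs to another
#  host (requires shared storage), -p 5 skips resubmission if the host
#  comes back within 5 monitoring cycles
HOST_HOOK = [
      NAME      = "error",
      ON        = "ERROR",
      COMMAND   = "ft/host_error.rb",
      ARGUMENTS = "$ID -m -p 5",
      REMOTE    = "no" ]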

Hello,

I already enabled the KVM UDP push driver, but the VM still ends up in the POWEROFF state when I turn off one of the hosts.

Any suggestion or solution?

Hello,

  • when we stop one of the hosts, ONE waits for several monitoring cycles before using the stonith script;
  • the host shuts down and is already inaccessible from ONE, but the VMs are still in RUNNING;

Why do you do this?

  • at this step we send POWEROFF to the VMs;
  • while the host is down, the VMs stay in SHUTDOWN state;
  • when the host comes back up, the VMs go to POWEROFF state;

Have you implemented a proper fencing mechanism in the host error hook?
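In case it helps: with that hook the VMs are only resubmitted after the failed host has been fenced, so the fence script called by host_error.rb has to be adapted to your hardware. A very rough sketch using ipmitool (the script location, how the host name reaches the script, and the BMC naming and credentials are all assumptions here; check the hook source for the real interface):

#!/bin/bash
# Sketch of a fencing script for the host error hook (assumptions marked below).
# Assumption: the failed host's name arrives as the first argument.
HOST_NAME="$1"

# Assumption: each hypervisor has a BMC reachable as <hostname>-ipmi
BMC_IP="${HOST_NAME}-ipmi"

# Power the failed host off; a non-zero exit should stop the hook from
# resubmitting the VMs while the host might still be writing to storage.
ipmitool -I lanplus -H "$BMC_IP" -U admin -P secret chassis power off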

Hi Feldsam,

I was trying to set up high availability for host failures:

https://docs.opennebula.org/5.2/advanced_components/ha/ftguide.html#host-failures

When I test the host-failure HA feature, with fencing enabled or disabled in the oned.conf file, the VM won't migrate to the other host and still ends up in the POWEROFF state.
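For reference, whether the host actually reaches the ERROR state and whether the hook fires at all can be checked like this:

onehost list                          # the failed host should show the error state before the hook runs
grep -i hook /var/log/one/oned.log    # hook executions (and failures) are logged by oned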

Hi,

FYI, this is the layout of our cloud design:

host1, front-end (HA)
host2, KVM node
host3, KVM node

and for the shared storage I am using GlusterFS on each host node.
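Since the -m option of the host error hook only reschedules VMs whose images are on shared storage, it may be worth confirming that OpenNebula actually sees the GlusterFS-backed datastores as shared, for example:

onedatastore list                                  # overview of datastores and their drivers
onedatastore show <datastore_id> | grep TM_MAD     # should report a shared transfer driver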