VM HA not working after update to 5.2

Hi! I tested VM HA on 5.0.2 and everything worked fine. After updating to 5.2 the hook no longer works: the VMs stay in the UNKNOWN state. I checked the configs on the nodes and they all look OK.

Hi Anton,

Please provide some more info. What is your HOST_ERROR hook configuration in /etc/one/oned.conf?

It is extremely dangerous (with a high risk of data corruption on the VM disks) to fail over when the host error is only a false positive or a monitoring error and the VM is still running on the “old” host. So in 5.2 a fencing script is expected to be configured and working by default. To get the 5.0.x behavior you must add ‘-u’ to the hook arguments to disable fencing.

In any case, there should be clues in the logs.

Kind Regards,
Anton Todorov

Hi! In /etc/one/oned.conf I have:

HOST_HOOK = [
    NAME      = "error",
    ON        = "ERROR",
    COMMAND   = "ft/host_error.rb",
    ARGUMENTS = "$ID -m -p 1",
    REMOTE    = "no" ]

I have an error in /var/log/one/host_error.log:

[2016-10-25 23:00:39 +0300][HOST 5][I] Hook launched
[2016-10-25 23:00:39 +0300][HOST 5][I] hostname: node03
[2016-10-25 23:00:39 +0300][HOST 5][I] Wait 1 cycles.
[2016-10-25 23:00:39 +0300][HOST 5][I] Sleeping 60 seconds.
[2016-10-25 23:01:39 +0300][HOST 5][I] Fencing enabled
[2016-10-25 23:01:39 +0300][HOST 5][E]
[2016-10-25 23:01:39 +0300][HOST 5][E] Fencing error
[2016-10-25 23:01:39 +0300][HOST 5][E] Exiting due to previous error.

I don’t have any other errors.



The script is working as designed. You can add ‘-u’ to the arguments to get the “old” behavior, but I strongly recommend implementing fencing. For the “old” behavior,

ARGUMENTS = "$ID -m -p 1",

should be changed to

ARGUMENTS = "$ID -m -p 1 -u",

The docs are lagging, though. For more details please check the comments inside the host_error.rb and fence_host.sh files in /var/lib/one/remotes/hooks/ft/.

Kind Regards,
Anton Todorov

I use fencing for the cluster, but in OpenNebula fencing is not working. I have tried many times, and every time I get a fencing error!

Can you show the fencing script that you are using?
It is OK to black out any sensitive info such as credentials/passwords…

Kind Regards,
Anton Todorov

# @param $1 the host information in base64
# @return 0 on success. Make sure this script does not return 0 if it fails.

# To enable remove this line
exit 1

# Get host parameters with XPATH

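if [ -z "$ONE_LOCATION" ]; then
    XPATH=/var/lib/one/remotes/datastore/xpath.rb
else
    XPATH=$ONE_LOCATION/var/remotes/datastore/xpath.rb
fi

if [ ! -x "$XPATH" ]; then
    echo "XPATH not found: $XPATH"
    exit 1
fi

XPATH="${XPATH} -b $1"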


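unset i j XPATH_ELEMENTS

while IFS= read -r -d '' element; do
    XPATH_ELEMENTS[i++]="$element"
done < <($XPATH     /HOST/ID \
                    /HOST/NAME \
                    /HOST/TEMPLATE/FENCE_IP )

HOST_ID="${XPATH_ELEMENTS[j++]}"
NAME="${XPATH_ELEMENTS[j++]}"
FENCE_IP="${XPATH_ELEMENTS[j++]}"

if [ -z "$FENCE_IP" ]; then
    echo "Fence ip not found"
    exit 1
fi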
# Fence

# Example:
# fence_ilo -a $FENCE_IP -l <username> -p <password>
/bin/ipmitool -I lanplus -H $FENCE_IP -U root -P mypassword chassis power off


You should comment out (or remove) the ‘exit 1’ line right below “To enable remove this line” :slight_smile:
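
One way to confirm whether the script itself is the problem is to run it by hand the way the hook does, with the host XML passed base64-encoded as the first argument, and then look at the exit code. This is only a minimal sketch: the helper name `check_fence_script` and the file names in the usage note are hypothetical, not part of OpenNebula.

```shell
#!/bin/bash
# Hypothetical helper: run a fence script by hand and report its exit code.
#   $1 - path to the fence script (e.g. the fence_host.sh discussed above)
#   $2 - file containing the host XML
check_fence_script() {
    local script="$1" xml_file="$2" encoded rc

    # The script expects the host information base64-encoded in $1.
    # tr strips the line wrapping that base64 adds by default.
    encoded="$(base64 < "$xml_file" | tr -d '\n')"

    # Capture the exit status without aborting under `set -e`.
    if "$script" "$encoded"; then
        rc=0
    else
        rc=$?
    fi

    if [ "$rc" -eq 0 ]; then
        echo "fence script succeeded"
    else
        echo "fence script failed with exit code $rc"
    fi
    return "$rc"
}
```

For example, as oneadmin, save the host XML with `onehost show -x 5 > /tmp/host5.xml` and then call `check_fence_script /var/lib/one/remotes/hooks/ft/fence_host.sh /tmp/host5.xml`. Keep in mind that a working fence script really will power off the host, so try this only on a node that can be shut down.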

Kind Regards,
Anton Todorov

No changes! :frowning:


If it is still failing, it is for a different reason…

The next step is to add some debugging lines to figure out what is going on. I prefer logging to syslog, so before each key line, such as the ‘if’ clauses, you can add something like:
logger -t fence_host "something meaningful"

Then you will know whether the script is called at all and how far it gets without issues. Then you can take measures depending on what the output is.
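
As a rough sketch of that kind of instrumentation, the helpers below wrap `logger` so that each checkpoint in the fence script leaves a tagged line in syslog. The function names `fence_log` and `require_value` are hypothetical, and the fallback to stderr when `logger` is unavailable is an assumption added for convenience.

```shell
#!/bin/bash
# Hypothetical logging helper for fence_host.sh: write to syslog with the
# tag "fence_host", or to stderr if logger is missing or fails.
fence_log() {
    logger -t fence_host -- "$*" 2>/dev/null || echo "fence_host: $*" >&2
}

# Hypothetical checkpoint: log the value and fail when it is empty.
#   $1 - name of the value, $2 - the value itself
require_value() {
    if [ -z "$2" ]; then
        fence_log "$1 is empty, aborting"
        return 1
    fi
    fence_log "$1=$2"
}
```

Inside the script this could be used as `require_value FENCE_IP "$FENCE_IP" || exit 1`. The messages can then be read back with `grep fence_host /var/log/syslog` or `journalctl -t fence_host`, depending on the distribution.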

Kind Regards,
Anton Todorov

Thank you for the help, but it still doesn’t work. I will use the old options.

Even after adding -u to the HOST_HOOK arguments, the fencing error still appears. Why? How can I make it work?