I use Ceph for shared storage and CephFS for the system datastore. I have not set up the hosts in a cluster.
Is there some way to set up a per-VM auto-recovery function for when a host fails?
Is there an API call that my monitoring can trigger when a host goes down together with a given VM, so that the lost VM is automatically migrated to and started on a running host?
As @thomasalrin said, you can use a hook for that.
The fault tolerance hook included in oned.conf has a -m option to do exactly that: migrate VMs to a new host when a host reaches the ERROR state.
#*******************************************************************************
# Fault Tolerance Hooks
#*******************************************************************************
# This hook is used to perform recovery actions when a host fails.
# Script to implement host failure tolerance
# It can be set to
# -m migrate VMs to another host. Only for images in shared storage
# -r recreate VMs running in the host. State will be lost.
# -d delete VMs running in the host
# Additional flags
# -f force resubmission of suspended VMs
# -p <n> avoid resubmission if host comes
# back after n monitoring cycles
#*******************************************************************************
HOST_HOOK = [
name = "error",
on = "ERROR",
command = "ft/host_error.rb",
arguments = "$ID -m -p 5",
remote = "no" ]
#-------------------------------------------------------------------------------
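The hook above acts on every VM on the failed host. If you would rather trigger recovery for selected VMs from your own monitoring, a rough sketch using the XML-RPC API could look like the following. It assumes that `one.vm.action` accepts the `resched` action with the parameter order (session, action, VM ID) and that the default endpoint `http://localhost:2633/RPC2` is in use; please check both against the XML-RPC reference for your OpenNebula version. The helper name `resched_vms`, the credentials, and the VM ID are made up for illustration.

```python
import xmlrpc.client


def resched_vms(server, session, vm_ids):
    """Flag each VM for rescheduling and return {vm_id: (ok, detail)}.

    `server` is an XML-RPC proxy for oned, `session` is "user:password".
    Assumption: one.vm.action(session, action, vm_id) replies with an
    array whose first element is a success flag and whose second element
    is the VM ID on success or an error message on failure.
    """
    results = {}
    for vm_id in vm_ids:
        reply = server.one.vm.action(session, "resched", vm_id)
        results[vm_id] = (bool(reply[0]), reply[1])
    return results


# Example call from a monitoring script (endpoint, credentials and VM ID
# are placeholders):
#
#   server = xmlrpc.client.ServerProxy("http://localhost:2633/RPC2")
#   print(resched_vms(server, "oneadmin:password", [42]))
```

As with the -m option, this only makes sense for VMs whose images live on shared storage, since the scheduler has to be able to start the VM on another host.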