How to Enable Automatic Live Migration on Host Failure in OpenNebula?

Hi OpenNebula Team,

I am setting up OpenNebula 6.10, and live migration works perfectly when triggered manually through the Sunstone interface.

I have also configured the necessary context, /etc/hosts entries, and SSH key exchange between hosts to support migration, and added the SUPPORT_MIGRATION = YES attribute to the hosts, but migration on host failure still doesn't work for me.

However, we want to automate this — specifically, we want VMs to automatically live migrate from Host A to Host B if Host A suddenly goes down (e.g., due to hardware failure).

Currently, if Host A goes down, the VM connection is lost, and we have to manually trigger the migration from Sunstone (which obviously isn’t possible if the host is unreachable).

Question:
What are the correct steps or configurations required to enable automatic live migration or HA failover in such scenarios?

Awaiting your response, team.

Hi @Senthil_Kumar_M,

I’m not part of the OpenNebula team, but I have dealt with a similar situation in my own infrastructure.

To answer your question: in OpenNebula 6.10, it is not possible to perform automatic live migration of virtual machines (VMs) in case of a sudden hypervisor failure (e.g., hardware failure). Although manual live migration works perfectly via the Sunstone interface, automatic live migration upon failure is not supported.

What is possible, however, is an automatic restart of VMs on another hypervisor, but this is not live migration. This means that the VMs will be stopped on the failing Host A and then started on Host B, which results in a temporary service interruption.

I may be mistaken, but from what I recall, this is how it works.

If this restart scenario suits your needs, you can follow the detailed steps provided here.

I hope this answers your question.

Regards,
Pape

Hi @Senthil_Kumar_M,

The OpenNebula configuration for migrating VMs in the event of a host error is documented here.
You should read the entire section and pay special attention to the host fencing setup. It is crucial to ensure, via IPMI shutdown or other means suitable for your environment, that the failed host is indeed down and that no VMs are still running on it. In the case of a management network error, the VMs on the host are still running, so starting a VM on another host will lead to two VMs using the same disks, and with shared storage that is the perfect recipe for disk data corruption.
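
To make the ordering concrete, here is a minimal Python sketch of the decision logic described above: wait a few monitoring cycles, fence the host, and only then restart its VMs elsewhere. This is not OpenNebula code or its API. The names `HostStatus`, `Action`, `next_action`, and the `grace_cycles` default of 5 are all hypothetical and chosen only for illustration; the real behaviour is defined by the host error hook and fencing script described in the documentation.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    WAIT = "wait"                # host may still recover; do nothing yet
    FENCE = "fence"              # power the host off (e.g. via IPMI) first
    RESTART_VMS = "restart_vms"  # safe to start the VMs on another host


@dataclass
class HostStatus:
    """Hypothetical snapshot of a failed host (not an OpenNebula type)."""
    name: str
    failed_monitor_cycles: int   # consecutive monitoring cycles in error
    fenced: bool                 # True only after a confirmed power-off


def next_action(host: HostStatus, grace_cycles: int = 5) -> Action:
    """Decide the next step for a host that stopped responding.

    The VMs are restarted elsewhere only once the host is confirmed
    fenced, so two copies of a VM can never write to the same disks.
    """
    if host.failed_monitor_cycles < grace_cycles:
        return Action.WAIT
    if not host.fenced:
        return Action.FENCE
    return Action.RESTART_VMS
```

The point of the design is that an unreachable host is never treated as a dead host: a management network outage would end in `FENCE`, not `RESTART_VMS`, until the power-off is confirmed.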

Best Regards,
Anton Todorov

Hi @Pape and @atodorov_storpool,

Thanks for the responses. Yes, I understand the critical points in this setup. I am also setting it up using the steps mentioned above and will let you know if I get stuck anywhere.

Best Regards,
Senthil Kumar M
