What’s your automation setup for multi-node OpenNebula operations?

We’re managing a multi-site OpenNebula deployment and want to better automate VM provisioning, patching, and failover checks across nodes.

Curious what your tool stack looks like. Using native hooks/scripts? Ansible? Any experience integrating tools like Terraform or something more orchestration-focused?

Would love to learn from how others handle this.

Hello,

In my opinion (some of my colleagues may differ), the best part of OpenNebula is that it gives you plenty of options when it comes to operations:

  • If you are used to working with Terraform, the OpenNebula provider is the best way to go. Please check the docs on the Terraform Registry for usage examples.
  • The CLI lets you do everything from a shell, but people differ on whether bash/zsh/mksh count as programming languages. So you also have Python, Ruby and Go bindings, plus an XML-RPC API entry point (if your language has libcurl bindings and you can cope with XML, you can call it).
  • For failover checks, there are hooks that can be executed after certain API calls or on certain events. Please note that some physical situations cannot be fixed (e.g., in case of a very sudden power cut due to a fire, flooding, etc., the VM memory on the running host cannot be sent to another host in time).
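To give an idea of what the XML-RPC route looks like, here is a minimal sketch in Python using only the standard library. It assumes the `one.vmpool.info` call with the parameters (session, filter flag, start ID, end ID, state); the exact parameter list can differ between releases, so please check the XML-RPC reference for your version. The helper names `parse_vm_pool` and `list_vms` are made up for this example.

```python
import xml.etree.ElementTree as ET
import xmlrpc.client


def parse_vm_pool(xml_text):
    """Return a list of (id, name, state) tuples from a VM_POOL XML document."""
    root = ET.fromstring(xml_text)
    return [
        (int(vm.findtext("ID")), vm.findtext("NAME"), int(vm.findtext("STATE")))
        for vm in root.findall("VM")
    ]


def list_vms(endpoint, session):
    """Ask oned for the VM pool and return the parsed (id, name, state) list.

    endpoint: XML-RPC URL of the front-end (assumption: something like
              http://<frontend>:2633/RPC2)
    session:  "user:password" style session string
    """
    server = xmlrpc.client.ServerProxy(endpoint)
    # Assumed arguments: -2 = all VMs, -1/-1 = no ID range,
    # -1 = any state except DONE.
    response = server.one.vmpool.info(session, -2, -1, -1, -1)
    ok, body = response[0], response[1]
    if not ok:
        # On failure the second element carries the error message.
        raise RuntimeError(f"OpenNebula API error: {body}")
    return parse_vm_pool(body)
```

Something like `list_vms("http://localhost:2633/RPC2", "oneadmin:password")` would then return the pool as plain Python tuples (endpoint and credentials here are placeholders). In practice the official Python bindings save you from handling the XML yourself.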

We’ve built a hybrid setup that’s been working well for managing OpenNebula across nodes:

Ansible: for basic config mgmt (host setup, file sync, network settings)

Attune: really helpful for multi-step jobs like post-deploy validation, Ceph checks, snapshot cleanup, etc. You can run scripts across all nodes and resume if one fails

Terraform: for VM definitions + templating (especially for non-OpenNebula resources)

Bash + Cron: still using these for fast, non-critical routines

For monitoring: Prometheus + custom alerts via Alertmanager
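For anyone wondering what "run across all nodes and resume if one fails" means in practice, here is a rough sketch of the general pattern in Python. To be clear, this is not Attune's API, just a generic illustration; `run_resumable`, the state file, and the step/node names are all hypothetical.

```python
import json
from pathlib import Path


def run_resumable(steps, nodes, state_file):
    """Run each (name, func) step on each node, skipping work already done.

    Progress is written to state_file after every successful step, so a
    rerun after a failure continues from the first unfinished step/node.
    """
    path = Path(state_file)
    done = set(json.loads(path.read_text())) if path.exists() else set()
    for step_name, func in steps:
        for node in nodes:
            key = f"{step_name}@{node}"
            if key in done:
                continue  # finished in an earlier run
            func(node)  # raises on failure; earlier progress stays on disk
            done.add(key)
            path.write_text(json.dumps(sorted(done)))
    return done
```

Each `func` would wrap whatever you actually run on the node (an SSH command, a Ceph health check, a snapshot cleanup script). After fixing the failing node, calling `run_resumable` again with the same state file picks up where it stopped.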