We’re managing a multi-site OpenNebula deployment and want to better automate VM provisioning, patching, and failover checks across nodes.
Curious what your tool stack looks like. Using native hooks/scripts? Ansible? Any experience integrating tools like Terraform or something more orchestration-focused?
In my opinion (some of my colleagues may disagree), the best part of OpenNebula is that it gives you plenty of options for operations:
If you are used to working with Terraform, the OpenNebula provider is the best way to go. Please check the docs in the Terraform Registry for usage examples.
The CLI lets you do everything from a shell, but people disagree on whether bash/zsh/mksh count as programming languages. So there are also Python, Ruby, and Go bindings, plus an XML-RPC API entry point (if your language has libcurl bindings and you can cope with XML, you can call it).
For failover checks, there are hooks that can be executed after certain API calls or on certain events. Please note that some physical situations cannot be fixed (e.g., with a very sudden power cut due to a fire, flooding, etc., the VM memory on the running host cannot be sent to another one in time).
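To make the XML-RPC and hook options a bit more concrete, here is a minimal Python sketch using only the standard library. It assumes the default endpoint `http://localhost:2633/RPC2`, a session string of the form `user:password`, and a state hook configured to pass the object template as a base64-encoded XML argument; the function names are mine, not part of OpenNebula. Treat it as a starting point, not a reference implementation.

```python
import base64
import xml.etree.ElementTree as ET
import xmlrpc.client


def parse_vm_pool(xml_text):
    """Turn a VM_POOL XML document into a list of small dicts."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": int(vm.findtext("ID")),
            "name": vm.findtext("NAME"),
            "state": int(vm.findtext("STATE")),  # numeric state code
        }
        for vm in root.findall("VM")
    ]


def list_vms(endpoint, session):
    """Call one.vmpool.info and return the parsed VM list.

    Assumed arguments: filter flag -2 (all VMs), ID range -1..-1 (no
    range limit), state -1 (any state except DONE).
    """
    server = xmlrpc.client.ServerProxy(endpoint)
    response = server.one.vmpool.info(session, -2, -1, -1, -1)
    ok, body = response[0], response[1]
    if not ok:
        # On failure the second element carries the error message.
        raise RuntimeError(body)
    return parse_vm_pool(body)


def decode_hook_payload(b64_payload):
    """Decode the base64 XML template a state hook receives as argument."""
    xml_text = base64.b64decode(b64_payload).decode("utf-8")
    root = ET.fromstring(xml_text)
    return {
        "kind": root.tag,  # e.g. HOST or VM, depending on the hook
        "id": int(root.findtext("ID")),
        "name": root.findtext("NAME"),
    }
```

A hook script would typically call `decode_hook_payload(sys.argv[1])` to find out which host or VM triggered it, then use `list_vms("http://localhost:2633/RPC2", "oneadmin:password")` (placeholder credentials) to decide what to check next.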
Attune: really helpful for multi-step jobs like post-deploy validation, Ceph checks, snapshot cleanup, etc. You can run scripts across all nodes and resume if one fails
Terraform: for VM definitions + templating (especially for non-OpenNebula resources)
Bash + Cron: still using these for fast, non-critical routines
For monitoring: Prometheus + custom alerts via Alertmanager
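As a rough illustration of how the cron routines and the Prometheus alerts can meet, here is a small Python sketch that renders check results in the Prometheus text exposition format and writes them to a file. It assumes you scrape that file with something like the node_exporter textfile collector, which is my assumption and not something described above; the metric name `failover_check_ok` and the check names are hypothetical.

```python
import os
import tempfile


def render_metrics(checks):
    """Render {check_name: passed} as Prometheus text exposition format."""
    lines = [
        "# HELP failover_check_ok 1 if the named check passed, 0 otherwise.",
        "# TYPE failover_check_ok gauge",
    ]
    for name, passed in sorted(checks.items()):
        value = 1 if passed else 0
        lines.append(f'failover_check_ok{{check="{name}"}} {value}')
    return "\n".join(lines) + "\n"


def write_textfile(path, text):
    """Write the metrics atomically so a scrape never sees a partial file."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    os.chmod(tmp_path, 0o644)  # mkstemp creates the file as 0600
    os.replace(tmp_path, path)
```

From cron, a script would collect its results into a dict such as `{"ceph_health": True, "oned_api": False}`, pass it to `render_metrics`, and write the output to a `.prom` file; an Alertmanager rule can then fire when `failover_check_ok == 0`.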