OpenNebula best practices

Is there an ‘ideal’ scenario diagram of how OpenNebula can be deployed in an HA fashion (2x Sunstone controllers + X nodes)? I’m looking for a good graphical representation and best practices for providing datastores via GlusterFS, NFS, or a SAN.

I’m thinking the best approach is to provide the storage directly to the nodes?

Cheers,
TK

I am relatively new to OpenNebula, but I’ll try:

HA is described in the Advanced administration guide: http://docs.opennebula.org/4.14/advanced_administration/high_availability/index.html

For my setup, I have not decided yet whether it is worth the additional complexity. Should OpenNebula fail, the physical hosts keep running, as do the existing VMs on them. So for the services running on the private cloud servers, an outage of OpenNebula would not matter much.

I have tried GlusterFS and CEPH, and for OpenNebula I decided to use CEPH instead of GlusterFS. I have a CEPH RBD pool as a main datastore for VM images, and a non-shared system datastore with TM_MAD=ssh. I am also considering CEPH-FS for the system datastore, but so far I am OK with the current setup.
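Roughly, the two datastore definitions look something like this; the pool, monitor and host names below are placeholders, not my actual values:

```bash
# Sketch only -- adjust pool/monitor/host/user names to your cluster.

# Ceph RBD image datastore
cat > ceph_images.ds <<'EOF'
NAME        = "ceph_images"
DS_MAD      = ceph
TM_MAD      = ceph
DISK_TYPE   = RBD
POOL_NAME   = one                    # RBD pool used for VM images
CEPH_HOST   = "mon1 mon2 mon3"       # Ceph monitors
CEPH_USER   = libvirt                # cephx user known to libvirt
CEPH_SECRET = "<libvirt-secret-uuid>"
BRIDGE_LIST = "node1 node2"          # hosts that stage images into Ceph
EOF
onedatastore create ceph_images.ds

# Non-shared system datastore, copied to the hosts over ssh
cat > ssh_system.ds <<'EOF'
NAME   = "local_system"
TYPE   = SYSTEM_DS
TM_MAD = ssh
EOF
onedatastore create ssh_system.ds
```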

FWIW, my nodes have two physical disks each: a 20 GB RAID-1 volume (Linux md-based) for the root filesystem, a 4 GB RAID-1 volume for swap, and the rest of both disks used as non-RAID partitions with XFS for two separate CEPH OSDs (my OpenNebula nodes are also CEPH nodes).
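For illustration, the per-node layout is roughly this (device and partition names are just examples):

```bash
# sda1/sdb1 -> 20 GB RAID-1 for /      sda2/sdb2 -> 4 GB RAID-1 for swap
# sda3/sdb3 -> rest of each disk, one XFS filesystem per Ceph OSD (no RAID)

mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2
mkfs.ext4 /dev/md0        # root filesystem
mkswap    /dev/md1        # swap
mkfs.xfs  /dev/sda3       # first OSD data partition
mkfs.xfs  /dev/sdb3       # second OSD data partition
```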

I use the MySQL backend, connecting remotely to a MySQL server that also hosts other databases outside OpenNebula.
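In oned.conf that is just the DB section pointed at the remote server, something like this (hostname and credentials are placeholders):

```
# /etc/one/oned.conf -- DB section for an external MySQL server
DB = [ BACKEND = "mysql",
       SERVER  = "db.example.net",
       PORT    = 3306,
       USER    = "oneadmin",
       PASSWD  = "changeme",
       DB_NAME = "opennebula" ]
```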

Hope this helps.

Okay, here is the first drawback of the approach described above: the 20 GB root volume per node is insufficient for the node Sunstone runs on and for the “bridge” nodes where images are converted for CEPH. The problem is that when an image is uploaded via Sunstone, it is first stored in /var/tmp on the node where Sunstone is running, then copied to one of the CEPH bridge nodes (possibly the same node Sunstone runs on, as in my case). On the bridge node it is converted from whatever format it is in (e.g. qcow2, or even raw) to raw, and only then is it transferred to CEPH.

So in order to upload, say, a 16 GB raw image, at least 3x16 GB of space is needed on the Sunstone node when it is also the CEPH bridge node. Unless OpenNebula is able to detect that the node Sunstone is running on is also a CEPH bridge node, and unless a raw image is recognized as such beforehand instead of being “converted” to raw again, you need at least two or three times more free disk space in /var/tmp than the largest image you would ever want to upload via Sunstone.
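If it helps, the upload staging directory should be configurable in sunstone-server.conf (the :tmpdir setting, if I remember correctly), so it can at least be moved off the small root volume. The space math for a 16 GB image looks roughly like this:

```bash
# Where Sunstone stages uploads (":tmpdir", if memory serves):
grep tmpdir /etc/one/sunstone-server.conf
#   :tmpdir: /var/tmp
# Pointing it at a bigger filesystem avoids filling the 20 GB root volume:
#   :tmpdir: /srv/one-upload

# Rough worst case for a 16 GB image when Sunstone and the Ceph bridge share a host:
#   16 GB  staged by Sunstone in :tmpdir
# + 16 GB  copy in the datastore staging area on the bridge node
# + 16 GB  output of "qemu-img convert -O raw" before "rbd import"
# = ~48 GB of transient local disk
```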

I actually had to decommission the two OpenNebula physical hosts I had and repurpose them; however, I may redeploy on a couple of Cisco UCS B200 M3s as a PoC. I’ve been using a combination of HAProxy, Keepalived and GlusterFS for Sunstone. I still have the original OpenNebula VM, which I’ll just reconnect to the blades at that time.
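For what it’s worth, the Sunstone side of that is just a small haproxy.cfg fragment plus a Keepalived VIP, something along these lines (addresses are placeholders; 9869 is Sunstone’s default port):

```
# haproxy.cfg fragment -- Keepalived floats the VIP 192.0.2.10 between the LB hosts
frontend sunstone_in
    bind 192.0.2.10:80
    mode http
    default_backend sunstone_nodes

backend sunstone_nodes
    mode http
    balance source              # keep a client pinned to one Sunstone instance
    option httpchk GET /
    server front1 10.0.0.11:9869 check
    server front2 10.0.0.12:9869 check
```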

Now I can’t recall offhand whether OpenNebula uses MySQL or PostgreSQL, but I’ve also been running both in clustered setups: PostgreSQL with Patroni + etcd, and MySQL with Galera, each as an elastic, redundant HA solution. I believe OpenNebula uses MySQL, so the Galera setup should work nicely as long as both front-end instances can connect to the same DB.
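In case it’s useful, a rough sketch of the Galera side; node addresses are placeholders and a real deployment would need more tuning:

```
# my.cnf fragment for a 3-node Galera cluster backing oned
[mysqld]
binlog_format            = ROW
default_storage_engine   = InnoDB
innodb_autoinc_lock_mode = 2
wsrep_on                 = ON
wsrep_provider           = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_name       = one_db
wsrep_cluster_address    = gcomm://10.0.0.21,10.0.0.22,10.0.0.23
wsrep_sst_method         = rsync
```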

Regards,
T

Although it will work, you should never deploy only two front-ends in an HA configuration; go with at least three servers so that you still have quorum if something bad happens.

Ah yes, I left out that I have a minimum of three nodes of each. (Thanks Sergio)
