OpenNebula on ZFS - addon for multiple hosts

Reposting from reddit r/sysadmin
Btw r/opennebula seems dead - no posts for one year…

I am considering running our OpenNebula so that all images are stored in ZVOLs on ZFS. Our OpenNebula currently runs on 70 hosts, and we will be upgrading it because it runs a very outdated version. We are using qcow2 delta images. The VMs are relatively short-lived and there are only around 50 base images in the system overall; adding more is not easily possible due to disk space limitations. When a base image is updated, it takes more work to start using it than I would like, because the disk space limitations mean older images need to be removed before new ones are added. I believe that is also the reason why the base images smell: they do not get the fixes they should at an adequate pace.

I have found the addon for ZFS and played around with it a bit. It seems to be implemented so that it only runs on a single host. It seems to work fine, but I did not find a simple way to make it work for multiple hosts.

My idea of using ZFS in ONE is as follows:

  • Base images are snapshots of ZFS ZVOLs
  • VM images are created as a clone of the ZFS snapshot

This would solve the problem with modifying the base images. After modifying a base image, a new snapshot of the ZVOL would be created and distributed to all hosts. The zfs send | zfs receive commands are very efficient at this. Also, old images would not have to be removed when only small changes are made, as disk space utilization would only grow by the delta between the new snapshot and what was already in the image.

Using clones as raw disks for the VMs will give us features similar to qcow2 delta images, only more efficiently, and it should be easier to use.
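To make the idea concrete, here is a minimal sketch of the snapshot/clone lifecycle. The pool name `tank`, the dataset layout, and the snapshot names are all made up for illustration. The script only prints the zfs commands unless `RUN=1` is set, so it can be tried without a pool.

```shell
#!/bin/sh
# Dry-run sketch: prints the zfs commands instead of executing them.
# Set RUN=1 to execute for real (needs root and an existing pool).
# The pool "tank" and all dataset/snapshot names are hypothetical.

run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Publish the current state of a base image zvol as a named snapshot.
publish_image() {   # usage: publish_image <zvol> <snapshot-name>
    run zfs snapshot "$1@$2"
}

# Create a VM disk as a copy-on-write clone of a base image snapshot.
clone_for_vm() {    # usage: clone_for_vm <zvol> <snapshot-name> <vm-id>
    run zfs clone "$1@$2" "tank/vms/vm-$3"
}

# Drop the VM disk once the VM is gone.
destroy_vm_disk() { # usage: destroy_vm_disk <vm-id>
    run zfs destroy "tank/vms/vm-$1"
}

publish_image tank/images/ubuntu v1
clone_for_vm  tank/images/ubuntu v1 7
# ... the base image gets patched, then a second snapshot is published ...
publish_image tank/images/ubuntu v2
destroy_vm_disk 7
```

Since ZFS refuses to destroy a snapshot while clones still depend on it, an old snapshot could only be cleaned up after the VMs cloned from it are gone, which should fit short-lived VMs reasonably well.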

Most of our hosts have 2TB of storage without any redundancy and we do not intend to change that, at least not at this moment. If the disks fail, the users understand they lose their VMs.

Our OpenNebula is split into two clusters, also because of disk space limitations. Implementing ZFS properly could also allow us to merge everything into a single cluster, reducing complexity and simplifying the deployment.

What are your thoughts about implementing this in OpenNebula? Is this doable with only an addon that would provide our way of using ZFS as a datastore, plus a transfer mechanism to make the images (snapshots) available across OpenNebula? Has anyone done anything similar before and is willing to share insight or even code?

Welcome, @Martin_Palecek!

PS - Who needs Reddit when we’ve got such a nice Community Forum?? :stuck_out_tongue_winking_eye:

I have started hacking a bit. So far I have modified the addon-zfs from @kvaps to use snapshots for the images. Just a slight cp and rm modification and not at all final code, only prototyping. The images reside under one ZFS subvolume as follows:

root@one:/var/lib/one/remotes/tm# zfs list -t snapshot |grep one-                         
one/images/focal-server-cloudimg-amd64-disk-kvm.img@one-42       0B      -     1.28G  -   
one/images/u-20.04-server-home-made.qcow2@one-43                 0B      -     2.41G  -   
one/images/ubuntu-20.04-minimal-cloudimg-amd64.img@one-41        0B      -      691M  -   

The one/images volume is synced across all hosts without OpenNebula knowing about it or having to do anything.

Then I have modified the shared TM to do the zvol clones when required. The clones are located under another ZFS subvolume:

root@one:/var/lib/one/remotes/tm# zfs list |grep vm-devices                       
one/vm-devices                                       11.8M   488G       96K  none 
one/vm-devices/vm-36                                 11.6M   488G      695M  -    
one/vm-devices/vm-42                                  144K   488G     2.41G  -    

A script takes care of running zfs send | zfs receive for one/images from the frontend to the backends. So, just like on the frontend, I can make clones on other hosts:

root@one-1:~# zfs list |grep vm-devices
one/vm-devices                                        240K   488G       96K  none
one/vm-devices/vm-43                                  144K   488G     2.41G  -
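For illustration, the sync step could look roughly like the sketch below. This is not the actual script: the function name, the per-zvol approach, and the snapshot names in the example call are assumptions. It also assumes passwordless SSH to the host and that the parent dataset (one/images) already exists there. By default it only prints the pipeline; set `RUN=1` to execute it.

```shell
#!/bin/sh
# Dry-run sketch of pushing one image snapshot from the frontend to a host.
# Set RUN=1 to execute for real (needs root, ssh access and ZFS on both ends).
# Function name and snapshot names are hypothetical.

# usage: sync_snapshot <zvol> <previous-snapshot-or-empty> <new-snapshot> <host>
sync_snapshot() {
    dataset=$1; prev=$2; new=$3; host=$4

    # Send only the delta if the host already has an older snapshot,
    # otherwise send the full stream.
    if [ -n "$prev" ]; then
        set -- zfs send -i "$dataset@$prev" "$dataset@$new"
    else
        set -- zfs send "$dataset@$new"
    fi

    if [ "${RUN:-0}" = "1" ]; then
        "$@" | ssh "$host" zfs receive "$dataset"
    else
        printf '%s | ssh %s zfs receive %s\n' "$*" "$host" "$dataset"
    fi
}

# First transfer of an image: full stream.
sync_snapshot one/images/base.img "" snap-1 one-1
# Later update of the same image: incremental stream.
sync_snapshot one/images/base.img snap-1 snap-2 one-1
```

The incremental form is what should keep image updates cheap: only the blocks changed between the two snapshots travel over the network.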

Some symlinks work as glue:

root@one-1:~# ls -l /var/lib/one/datastores/0/43/
total 531
-rw-rw-r-- 1 oneadmin oneadmin 1786 Jul 19 20:15 deployment.3
-rw-rw-r-- 1 oneadmin oneadmin 1786 Jul 19 20:17 deployment.4
lrwxrwxrwx 1 root root 30 Jul 19 20:15 disk.0 -> /dev/zvol/one/vm-devices/vm-43
-rw-r--r-- 1 oneadmin oneadmin 42949672960 Jul 19 20:15 disk.1
-rw-r--r-- 1 oneadmin oneadmin 372736 Jul 19 20:17 disk.2
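The glue itself boils down to one `ln` per disk. A minimal sketch, using the paths from the listing above; the helper name and its argument order are made up, and it only prints the command unless `RUN=1` is set:

```shell
#!/bin/sh
# Dry-run sketch of the symlink glue between a VM directory and its zvol clone.
# Set RUN=1 to actually create the link. The helper name is hypothetical.

# usage: link_disk <system-datastore-dir> <vm-id> <disk-index>
link_disk() {
    target="/dev/zvol/one/vm-devices/vm-$2"
    link="$1/$2/disk.$3"
    if [ "${RUN:-0}" = "1" ]; then
        # -f replaces an existing link, -n avoids following an existing link
        ln -sfn "$target" "$link"
    else
        printf 'ln -sfn %s %s\n' "$target" "$link"
    fi
}

link_disk /var/lib/one/datastores/0 43 0
```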

I realize this is hackish, but in principle this seems fully sufficient for what I wanted to do.
More code needs to be modified; so far only image creation and VM instantiation work. Other operations are probably broken.

Ideas? Thoughts? I would appreciate feedback from an experienced OpenNebula admin/developer.