Setting a template with a perfectly suspended state

Hi, I’m a bit worried this question might be a duplicate.

Is it easy to set up an OpenNebula template so that it is instantiated from a saved (suspended) VM state, so the newly spawned VM is running right away, just before contextualization, without needing to boot from scratch?

Let me break the question up, with a small modification:

  1. Is it possible to make an OpenNebula template that spawns a pre-booted (or almost fully booted) VM?
  2. Can you suggest the right time to suspend such a simple web-service VM for the template?

Best Wishes,
Gunwoo Gim

I just found out there is a virsh subcommand called managedsave; would anyone know the best practice here?

   managedsave domain [--bypass-cache] [{--running | --paused}] [--verbose]
       Save and destroy (stop) a running domain, so it can be restarted from the same state at a later time.  When the virsh start command is next run
       for the domain, it will automatically be started from this saved state.  If --bypass-cache is specified, the save will avoid the file system
       cache, although this may slow down the operation.
       The progress may be monitored using domjobinfo virsh command and canceled with domjobabort command (sent by another virsh instance). Another
       option is to send SIGINT (usually with "Ctrl-C") to the virsh process running managedsave command. --verbose displays the progress of save.
       Normally, starting a managed save will decide between running or paused based on the state the domain was in when the save was done; passing
       either the --running or --paused flag will allow overriding which state the start should use.
       The dominfo command can be used to query whether a domain currently has any managed save image.
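
In case it helps, this is roughly how I imagine it would be used by hand on the hypervisor; the domain name one-123 is just a placeholder, and I haven’t checked how OpenNebula reacts to a managed save image it didn’t create itself:

    # save the running domain's state to libvirt's managed save area and stop it
    virsh managedsave one-123 --running --verbose

    # check whether a managed save image exists for the domain
    virsh dominfo one-123 | grep -i "managed save"

    # the next start resumes from the saved state instead of booting
    virsh start one-123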

Hmm, I don’t think this is supported yet.

This would require an image to be deployed with a snapshot. Upon “boot” the snapshot would have to be reverted, which would basically take the same amount of time, or even longer, as the whole RAM has to be loaded from disk.
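
To illustrate, on plain libvirt the equivalent would be a full (disk + RAM) snapshot of a running qcow2-backed domain that you revert instead of booting; the domain name below is just a placeholder, and OpenNebula does not drive this for you:

    # take a full snapshot of a running qcow2-backed domain (disk + RAM state)
    virsh snapshot-create-as one-123 booted "state right after boot"

    # later, jump back to that state instead of booting from scratch;
    # the whole RAM image still has to be read back from disk
    virsh snapshot-revert one-123 booted --running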

You might open a ticket for the OpenNebula team to ask about such a feature, but I think the deployment of VMs is already insanely fast on a good storage backend.

Some tips to increase VM deployment performance:

  • use a fast storage backend with ~150-500 MB/s R/W and solid IOPS performance (SSDs or an HDD RAID 10, for example)
  • when building your images, make sure they are as small as possible (don’t install any software that is not needed in every instance)
  • don’t store your images and your system data on the same storage (unless it’s insanely fast), as OpenNebula copies the image from your image store to your system store when deploying it (!)
  • after building an image, compress it: qemu-img convert -cpO qcow2 /PATH/TO/PREPARED_IMAGE.qcow2 /tmp/YOUR_COMPRESSED_IMAGE.qcow2. Then import the compressed image into your image store and assign it to your template (see the sketch after this list). This greatly improves deployment speed because the image’s empty space is “cut away” (compressed). For instance, in my setup I can compress a fully fledged Ubuntu Server 16.04.1 image from ~5 GB down to 1.3 GB after the initial setup. (Keep in mind compression is not possible on raw images and should not be used with persistent images.) This allows me to deploy an Ubuntu VM within about 15-20 seconds (including the GRUB delay!).
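
A rough sketch of that compress-and-import step, assuming a qcow2 source image and an image datastore named default (adjust names and paths to your setup):

    # compress the prepared image; -c enables compression, -p shows progress
    qemu-img convert -c -p -O qcow2 \
        /PATH/TO/PREPARED_IMAGE.qcow2 /tmp/YOUR_COMPRESSED_IMAGE.qcow2

    # import the compressed image into the image datastore,
    # then assign it to your template
    oneimage create --name "ubuntu-16.04-compressed" \
                    --path /tmp/YOUR_COMPRESSED_IMAGE.qcow2 \
                    --type OS --driver qcow2 \
                    --datastore default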

I hope this helps!

A quick sample in my setup:

(In this setup the VM image is copied from an SSD to an HDD LVM striped volume on top of a couple of software RAID 1 arrays, at ~150-300 MB/s R/W.)
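
If you want to reproduce a similarly crude number on your own host, a plain sequential dd write against the system datastore path gives a rough idea (the default OpenNebula datastore path is assumed here; remove the test file afterwards):

    # crude ~1 GB sequential write test, bypassing the page cache
    dd if=/dev/zero of=/var/lib/one/datastores/0/ddtest \
       bs=1M count=1024 oflag=direct status=progress
    rm /var/lib/one/datastores/0/ddtest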

I really appreciate you sharing your practical benchmarks, Bent_Haase.

I think the feature would still be very cool for my team, though. I guess it would reduce the deployment time by more than 30%. Please let me explain what made me interested in this feature and why I plan to implement it.

I’m using a Ceph cluster for the datastore, so the disk image copy process would be instant; if I’m not mistaken, the default driver for the Ceph datastore just makes a CoW clone from the parent snapshot.
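
As far as I understand, at the Ceph level it boils down to something like the following (pool and image names are made up; the transfer driver does the equivalent automatically):

    # snapshot the base image and protect the snapshot so clones can hang off it
    rbd snap create one/one-42@snap
    rbd snap protect one/one-42@snap

    # a clone is a copy-on-write child of the snapshot, created almost instantly
    rbd clone one/one-42@snap one/one-42-clone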

And my team is getting 10+ Gigabit networking, so copying the memory dump wouldn’t take too long either.

Best Regards,
Gunwoo Gim

Fair enough! I recommend opening a ticket about that then and suggesting it officially.

Depending on the Ceph storage speed, deployment will be near-instant either way, I’d say.

Do you have any info about the R/W performance or IOPS of that cluster?
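
If you haven’t measured it yet, rados bench should give rough numbers (the pool name rbd is just an example; --no-cleanup keeps the objects around for the read runs):

    # 10-second write benchmark, keep the objects for the read runs
    rados bench -p rbd 10 write --no-cleanup

    # sequential and random read benchmarks against those objects
    rados bench -p rbd 10 seq
    rados bench -p rbd 10 rand

    # remove the benchmark objects afterwards
    rados -p rbd cleanup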

Regarding my setup: it’s a pretty low-end homelab; still, I’m really happy with the speeds already. Dreaming about my own Ceph cluster and a 10 GBit local network on the other side :open_mouth: