I am sure this must be something of a FAQ, but I can't figure it out and maybe I'm searching with the wrong keywords.
My use-case is really simple: I store my template disk images on a mount backed by my very small NVMe drives. When I instantiate a persistent VM, I want the image to be cloned to either my SSD datastore or my SATA datastore.
Right now, when I instantiate the template, the image is always cloned into the templates datastore, and I have to move it out manually.
Is this something that can be done in Sunstone so I can skip the additional step? I'd be happy either with setting a different default for instantiating a given template, or with choosing at launch. There only seems to be an option to select the system datastore, but I only have one of those anyway.
Check the datastore internals section of each datastore driver for details. The VM disk placement depends on several factors: persistency, storage driver, and deployment mode.
Thanks for the reply! I have three local datastores for images (each backed by a different ZFS pool, but that layer sits underneath OpenNebula) - one for the templates I download/create, and two for actual VMs.
I'm familiar with the page you linked, but unless I'm missing something it doesn't explain how datastores are picked when launching a new VM. Some other threads and docs pointed me to /etc/one/sched.conf, where the following should do the trick:
# DEFAULT_DS_SCHED: Definition of the default storage scheduling algorithm
#   - policy:
#       0 = Packing. Tries to optimize storage usage by selecting the DS with
#           less free space
#       1 = Striping. Tries to optimize I/O by distributing the VMs across
#           datastores.
#       2 = Custom.
#           - rank: Custom arithmetic expression to rank suitable datastores
#             based on their attributes
#       3 = Fixed. Datastores will be ranked according to the PRIORITY attribute
#           found in the Datastore template.
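For completeness, the actual value in my sched.conf currently looks roughly like this (attribute casing may differ slightly in the stock file):

DEFAULT_DS_SCHED = [
   POLICY = 1
]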
The challenge is that my policy is set to 1, which means I should see VMs being distributed across the three datastores (which wouldn't work for me anyway) - but that doesn't seem to match what I see: VMs consistently land in the templates datastore.
Do you have any further pointers or ideas? Thank you!
That is the default policy the scheduler uses when a VM is created. However, you can define a scheduling policy on a per-VM basis. Check the VM template scheduling reference, which also comes with examples, to get a clearer picture.
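Roughly, the per-VM datastore policy goes into the VM template like this (the datastore name and the rank expression are just placeholders):

SCHED_DS_REQUIREMENTS = "NAME=\"ssd-system\""
SCHED_DS_RANK         = "FREE_MB"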
I tried that, but pointing SCHED_DS_REQUIREMENTS at one of my images datastores fails with:
Cannot dispatch VM: No system datastore meets capacity and SCHED_DS_REQUIREMENTS: ("CLUSTERS/ID" @> 100) & ( NAME="tier1-images" )
Which I think implies this attribute only lets me select the system datastore - not the images datastore where the main image gets cloned (although this is not explicitly documented).
Image datastores hold the images that VMs will eventually use. Once a VM uses an image, there is a transfer to a System Datastore. Simply put, you cannot really use an image datastore as a target for a deployment, by design. The System Datastore represents the hypervisor storage.
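You can see the distinction in the datastore templates themselves; roughly (names and drivers below are just illustrative):

# image datastore: a registry of images
NAME   = tier1-images
TYPE   = IMAGE_DS
DS_MAD = fs
TM_MAD = shared

# system datastore: where running VMs live on the hypervisor
NAME   = ssd-system
TYPE   = SYSTEM_DS
TM_MAD = shared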
Hmm, if this is the case then something is broken in my setup. Can you confirm the below makes sense to you:
1 - When I pull an image from the MarketPlace, I have to put it in an images datastore. In my case it’s the datastore “templates”.
2 - If I want to create a VM from that image, I will navigate to Templates (or use the corresponding CLI - see the sketch after this list), and instantiate the VM as persistent or not.
3a - If I instantiate the VM as NON persistent, the image stays where it is and a CoW image is created in the system datastore to hold its data. When the VM is terminated, that data is discarded.
3b - If I instantiate the VM as persistent, what happens is that the image (which is in the templates DS) is cloned (in the same DS), tagged as persistent and then used. The data will always live there.
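On the CLI that step is roughly the following (the template ID and VM names are just examples):

# non-persistent instantiation
onetemplate instantiate 42 --name test-vm

# persistent instantiation (clones the images and marks the clones persistent)
onetemplate instantiate 42 --name prod-vm --persistent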
The 3b behaviour doesn't depend on any configuration, I think, and makes sense to me (persistent volumes live in images datastores too, even while in use). What I'm asking for is that, in that case, the image is cloned to another images datastore instead of to the same one.
Here's what a persistent VM looks like in my system datastore:
/var/lib/one/datastores/100/140:
total 62K
drwxr-xr-x 2 oneadmin oneadmin 30 Nov 26 09:26 .
drwxr-xr-x 10 oneadmin oneadmin 10 Nov 26 13:50 ..
-rw-r--r-- 1 oneadmin oneadmin 2.1K Aug 5 2022 deployment.0
[...]
-rw-r--r-- 1 oneadmin oneadmin 2.1K Jan 25 2023 deployment.9
lrwxrwxrwx 1 oneadmin oneadmin 67 Aug 5 2022 disk.0 -> /var/lib/one/datastores/101/fc407cbbe61adb5c71328a6a4f0ec4f4.snap/0
lrwxrwxrwx 1 oneadmin oneadmin 67 Aug 5 2022 disk.1 -> /var/lib/one/datastores/102/41508e219b3e66ff5d6ea617f31b5f6e.snap/0
-rw-r--r-- 1 oneadmin oneadmin 364K Nov 26 09:26 disk.2
-rw-r--r-- 1 oneadmin oneadmin 963 Nov 26 09:26 ds.xml
-rw-r--r-- 1 oneadmin oneadmin 8.8K Nov 26 09:26 vm.xml
As you can see, disk.0 and disk.1 are simply symlinked to the corresponding images datastore. disk.1 I created manually, but disk.0 is exactly where OpenNebula put it when I instantiated the VM.
Hey, this confused me as well at first.
First, an image lives in the image datastore when not in use or when downloaded from the marketplace.
When you instantiate a VM from it, a snapshot is created and linked to the system datastore. So now OpenNebula knows the image is in use and has a VM attached to it. So for a persistent disk attached to a VM, you'll see the symlink being created to the image in the image datastore.
For non-persistent VMs it works a bit differently: the VM starts with the disk from the image datastore, and an extra disk is created containing the changes that the VM makes. So when the VM is stopped, the original image has not changed its contents; all changes are stored in the second disk image.
When you undeploy that VM, the second disk gets thrown away and the original non-persistent image stays the same. So a default Ubuntu image will stay the default Ubuntu image; if you install something, for example Apache, that gets added to the second image, so the original Ubuntu image is not changed at all.
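You can actually check this on the hypervisor, assuming a qcow2-based TM driver is in use (the path below is just a placeholder for a non-persistent disk in the system datastore):

qemu-img info /var/lib/one/datastores/<system_ds_id>/<vm_id>/disk.0
# the "backing file:" line points back at the original image in the image
# datastore; the qcow2 file in the system datastore only holds the VM's changes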
Hope this makes sense - and to the rest of the forum - please correct me if I’m wrong
I'm aligned with the above; that's my understanding of the docs and what I observe in production. I'm just unsure why ONE would let me choose where to store the VM .xml and its symlinks (the system datastore) but not where to put the actual data. Two-tiered storage setups like my lab shouldn't be so rare, should they?
You don't really choose where to put your XMLs or symlinks; rather, what you choose is where to deploy the VM on the hypervisor filesystem. The path you choose is the system datastore mountpoint plus the vm_id directory. In there lives everything that is required for the VM:
disks
context ISO
hypervisor definition file (libvirt XML in the KVM case)
possible snapshots that might occur during the VM lifetime
etc.
So you don’t really have a templates datastore
In the case of the Ceph driver, for example, the VM disk lives at the Ceph cluster level, yet you still need this filesystem path to hold the rest of the dynamically generated files passed to the hypervisor. You effectively have two-tier storage.
The use case of two different mountpoints for multi-tiered storage could be added as an extension of the existing drivers with an extra parameter. If you feel this could be a useful feature, please open a feature request.
Regardless of that, you can always automate what you currently do manually with the hook subsystem, using VM state hooks.
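A minimal sketch of such a hook (the name, script path and trigger state are assumptions, not a recommendation):

# relocate-persistent-disk.tmpl
NAME      = relocate-persistent-disk
TYPE      = state
RESOURCE  = VM
STATE     = ACTIVE
LCM_STATE = RUNNING
COMMAND   = "/usr/share/one/hooks/relocate_disk.sh"
ARGUMENTS = "$ID"

# registered with: onehook create relocate-persistent-disk.tmpl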
I think I got what you mean - but I don't think you're considering the case where a VM is instantiated as "persistent". In that case, as per the directory listing above, there is no disk in the system datastore at all. The VM's data (as in, data stored by the VM's operating system) lives on a clone of the template's image, stored in the same datastore where the template was.
@VURoland has confirmed his system behaves the same way, so, to achieve what I want, it would seem the only option is to create multiple system datastores on the different zpools and instantiate the VMs as NON persistent. That way the "actual data" would be in the system datastore.
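In rough terms I think that would mean one system datastore per storage tier, something like this (names are placeholders and the TM_MAD choice depends on the driver in use):

# ssd-system.ds
NAME   = ssd-system
TYPE   = SYSTEM_DS
TM_MAD = shared

# onedatastore create ssd-system.ds
# then mount (or symlink) the corresponding zpool dataset at
# /var/lib/one/datastores/<new_ds_id> on the hosts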
To me, though, it seems counter-intuitive to instantiate a VM as non-persistent - is this general practice?
My default VMs are 99% persistent, as I need the VM to perform a task for a longer time, but I have found non-persistent VMs to be really useful for some cases, like:
a class where 30 students need a personal VM. We can prepare an image with a certain dataset and tools, then mark that image as non-persistent. Now we can use that single VM image and instantiate 30 VMs from it (see the CLI sketch after these examples), all with their own IP addresses etc. (OpenNebula takes care of all that stuff). Students can play with the software and data, and if someone f*cks up the content, we can just kill the VM and deploy a fresh one with the same content. Super handy.
sometimes we need to process a LOT of data in parallel, so we build a VM with all the needed software and then deploy lots of identical VMs. We mount the external storage with the necessary dataset and let them process it. When processing is over, we kill all the VMs and keep the single non-persistent disk image for later use (if needed). So instead of needing terabytes of space for all the images, we can start lots of VMs using a single couple-of-GB image on the storage.
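For the class case above, the mass deployment itself is basically a one-liner (the template ID and the count are placeholders):

onetemplate instantiate <template_id> --multiple 30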