I have a question about storage that hopefully someone can shed some light on.
Well, there are a couple actually:
1 - In SSH transfer mode, is the VM image saved on the frontend? If so, does it then get copied via SSH to the host where you want to deploy it, and once it's deployed, does the VM file only exist on that host or does the frontend also keep a copy? If the former, I only need enough space on my frontend's datastore to hold the actual images.
2 - I know SSH mode doesn't allow live migration, so if we use NFS, can we share each host's local storage via NFS to the frontend so that when a VM is deployed everything goes over NFS? That way, if we add another node we can do the same, so essentially the frontend just mounts NFS shares of the hosts' local storage, and if we did that we could do live migration.
If so, are there any performance impacts here? I may be missing the obvious.
Otherwise I'm really struggling to find an option which will allow us to use the nodes' direct storage but also allow live migration.
The SSH transfer mode seems good in terms of speed, but I think it's a bit of a shame that there is no way to migrate the VMs without powering them off first.
Any help or suggestions would be appreciated here.
About your first question: two different datastores are used in that procedure. Images are held inside the Images datastore (ID 1 by default). This datastore might be located on the frontend (filesystem datastore) or on another supported backend (NFS, Ceph, etc.). Once a VM is instantiated, the image is copied (not moved) to the System datastore (ID 0 by default). This process happens in different ways depending on the Transfer Manager (TM) driver you are using (scp, link, etc.).
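To make that concrete, with the ssh TM driver the two datastore definitions would look roughly like this (names are just examples, attributes as I remember them from the OpenNebula datastore docs), and on instantiation the image gets scp'd to something like /var/lib/one/datastores/0/<vm_id>/disk.0 on the chosen host:

    # image datastore, stored on the frontend
    NAME   = local_images
    DS_MAD = fs
    TM_MAD = ssh
    TYPE   = IMAGE_DS

    # system datastore, ends up on each host's local disk
    NAME   = local_system
    TM_MAD = ssh
    TYPE   = SYSTEM_DS

Each template is registered with onedatastore create <file>.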
About your second question, you are referring to the shared TM driver.
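A rough sketch of that setup, assuming a single NFS export mounted at the default datastore location on the frontend and on every host (server name, export path and IDs are just examples):

    # on the frontend and on every host (e.g. via /etc/fstab)
    mount -t nfs storage01:/export/one /var/lib/one/datastores/0

    # system datastore switched to the shared TM driver
    NAME   = nfs_system
    TM_MAD = shared
    TYPE   = SYSTEM_DS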
What you are looking for is something like live migration using block commit, and I think that is not supported in OpenNebula at the moment. For increased performance, scalability and the possibility of live migration, you should evaluate deploying a Software Defined Storage solution, like Ceph.
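Once the system datastore is shared (NFS, Ceph, etc.), a live migration is just the standard CLI call, for example (VM and host IDs made up):

    onevm migrate --live 42 node02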
Actually, if you have 4 nodes with 100 TB each and you want to fully use their space with persistent images, you would need at least 400 TB on your frontend. SSH drivers are useful because they let you take advantage of your local disks' bandwidth, but that's it. The two main drawbacks are that deployment time will be slow and that you will need to double space consumption in some cases.
Shared drivers allow you to use NFS datastores and only a link will be created, so you won't double the used space, although your NFS disks' bandwidth will be a bottleneck. You could use several NFS datastores to improve throughput, but I'm sure you are already noticing how hard scalability gets.
A Software Defined Storage solution won't have these drawbacks, and recently a new option was added to Ceph datastores that allows you to copy some disks to the compute nodes while running other VMs directly from the Ceph datastore.
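For reference, a Ceph image datastore definition looks roughly like this (pool, monitors, user and secret are placeholders, attribute names as I recall them from the Ceph driver docs):

    NAME        = ceph_images
    DS_MAD      = ceph
    TM_MAD      = ceph
    TYPE        = IMAGE_DS
    POOL_NAME   = one
    CEPH_HOST   = "mon1 mon2 mon3"
    CEPH_USER   = libvirt
    CEPH_SECRET = "uuid-of-the-libvirt-secret"
    BRIDGE_LIST = "host1 host2"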
Thanks for explaining, that makes sense now. I've had a look at Ceph and it looks like a bit of a mission to get set up and working, and I need a bit of kit even just to test it out.
I'm looking at using GlusterFS or LizardFS now.
If I use GlusterFS, must I mount each GlusterFS server as a separate datastore even if I am using GlusterFS to create one large distributed pool? Or would I mount just the one volume and let it decide where to write the files?
Also, what's performance like running VMs from a shared datastore? I would be using non-persistent images and am looking for the best way to do this.
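To illustrate what I'm picturing with GlusterFS: one distributed volume across the nodes' local disks, mounted once as the datastore so Gluster decides where files land (volume name, bricks and paths are just examples):

    # one distributed volume spanning the nodes' local disks
    gluster volume create vmpool node01:/bricks/vm node02:/bricks/vm node03:/bricks/vm
    gluster volume start vmpool

    # mount it once as the datastore on the frontend and the hosts
    mount -t glusterfs node01:/vmpool /var/lib/one/datastores/0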