We’re working on deploying OpenNebula + KVM/EC2. At the moment we need to consolidate 4,000+ hosts and tens of thousands of VMs. We’re talking to storage vendors to figure out how to handle this volume. I’m still in the discovery phase but am a little unclear on where the VMs are stored while running. The documentation led me to believe they’re copied to the node via SSH or another mechanism. If so, what purpose does the NFS mount serve? This will impact how much storage we need and where and how it’s mounted.
Also, is it possible to import existing VMs on a running KVM node? Imagine we have ~50 VMs on a KVM node. If I set up the OpenNebula node package, could I add the host and ‘automagically’ see the existing VMs, or do they need to be built for OpenNebula?
Thanks!
P.S. Do you offer consulting? It would be nice to speak with an expert on our architecture.
OpenNebula distinguishes between the image datastore (which maintains a catalog of images) and the system datastore (where the disks of running VMs reside). When a VM is deployed, its images pass from the image datastore to the system datastore; the mechanism varies depending on the type of datastore backend.
For deployments of this scale we usually recommend Ceph, or a multi-tier approach based on clusters.
There are a number of datastore drivers in OpenNebula (upstream and addons - http://opennebula.org/addons/catalog/) including StorPool, Ceph, LVM, iSCSI, etc. VMs’ virtual disks are stored in image datastores, while metadata such as the XML descriptions of the images and saved states go into a System Datastore. The System Datastore is either an NFS mount on all hosts, or the files are copied to each host with scp. This depends on the backend you decide to use. Some backends copy images to and from the system datastore; others, like StorPool, just place symlinks in it. Some backends are more capable than others, so I would definitely test extensively before picking one. The good part is that it is a modular system: you can always switch to another storage system.
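To make the sizing question concrete, here is a rough back-of-the-envelope sketch of how the two System Datastore styles described above shift where capacity is needed. Everything in it is an assumption for illustration: the `Fleet` type, the `"shared"` and `"copy"` labels, the 20 GB average disk size, and the even spread of VMs across hosts are hypothetical and are not OpenNebula names or settings. It also ignores the image catalog, snapshots, and backends that only place symlinks.

```python
from dataclasses import dataclass


@dataclass
class Fleet:
    hosts: int          # number of KVM hosts
    vms: int            # number of running VMs
    avg_disk_gb: float  # assumed average virtual disk size per VM


def running_disk_footprint(fleet: Fleet, mode: str) -> dict:
    """Estimate where the disks of running VMs consume space.

    mode is a hypothetical label used only in this sketch:
      "shared" - the system datastore is an NFS mount visible on all hosts
      "copy"   - disks are copied (e.g. with scp) to each host's local disk
    """
    total_gb = fleet.vms * fleet.avg_disk_gb
    if mode == "shared":
        # All running disks live on the shared mount; hosts need no local space.
        return {"shared_gb": total_gb, "local_gb_per_host": 0.0}
    if mode == "copy":
        # Running disks live on the hosts; assumes VMs are spread evenly.
        return {"shared_gb": 0.0, "local_gb_per_host": total_gb / fleet.hosts}
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    fleet = Fleet(hosts=4000, vms=10000, avg_disk_gb=20.0)
    for mode in ("shared", "copy"):
        print(mode, running_disk_footprint(fleet, mode))
```

With real numbers for your disk sizes and VM placement, the same arithmetic shows whether the load lands on one shared array or on local disks in every node.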
I think you probably recognize that the storage load of 10,000 VMs is not an easy thing to handle. We (StorPool) can definitely help with that and provide you with a solution better than Ceph.
As for importing existing VMs, as Tino wrote, I also believe Wild VMs are what you are looking for.
Also, if you would like, we can help with advice based on our experience with large-scale clouds (both OpenNebula and non-OpenNebula). Just drop us a line.