How to architect storage

Hi,

We're working on deploying OpenNebula + KVM/EC2. At the moment we need to consolidate 4,000+ hosts and tens of thousands of VMs, and we're talking to storage vendors to figure out how to handle this volume. I'm still in the discovery phase, but I was a little unclear on where the VMs are stored while running. The documentation led me to believe they're copied to the node via SSH or some other mechanism. If so, what purpose does the NFS mount serve? This will impact how much storage we need and where/how it's mounted.

Also, is it possible to import existing VMs on a running KVM node? Imagine we have ~50 VMs on a KVM node. If I set up the OpenNebula node package, could I add the host and 'automagically' see the existing VMs, or do they need to be rebuilt for OpenNebula?

Thanks!

P.S. Do you offer consulting? It would be nice to speak with an expert on our architecture.

Take a look at Ceph, specifically the RBD (RADOS Block Device) part.

A good place to start is reading the OpenNebula Reference Architecture:
https://support.opennebula.pro/hc/en-us/article_attachments/202208405/OpenNebula-Open_Cloud_Reference_Architecture_Rev1.0_20150421.pdf

More info here:
http://opennebula.systems/jumpstart/

OpenNebula distinguishes between Image Datastores (which maintain a catalog of images) and the System Datastore (where the disks of running VMs reside). When a VM is deployed, its images are transferred from the image datastore to the system datastore; the mechanism varies depending on the datastore's backend.
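
To make that concrete, here is a minimal sketch of the two datastore types using the plain filesystem drivers (all names and values below are placeholders, not taken from this thread; check the storage guide for your release for the exact attributes):

    # image_ds.txt: the catalog of images, staged by the frontend
    NAME   = img_ds
    TYPE   = IMAGE_DS
    DS_MAD = fs
    TM_MAD = ssh    # disks are copied to the host over scp at deploy time

    # system_ds.txt: where the files of running VMs live on the hosts
    NAME   = sys_ds
    TYPE   = SYSTEM_DS
    TM_MAD = ssh    # use "shared" if the directory is an NFS mount on every host

    onedatastore create image_ds.txt
    onedatastore create system_ds.txt

With TM_MAD = ssh there is no NFS mount at all; with TM_MAD = shared the system datastore directory is expected to be a shared (e.g. NFS) export mounted on every host, which is the "copied via SSH vs. NFS" distinction asked about above.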

For deployments of this scale we usually recommend Ceph, or a multi-tier approach based on clusters.
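
In case it helps with sizing, a Ceph-backed setup is defined as a pair of datastores roughly like the sketch below (attribute names follow the Ceph datastore guide; every value is a placeholder to be replaced with your cluster's details):

    # ceph_image_ds.txt
    NAME        = ceph_images
    TYPE        = IMAGE_DS
    DS_MAD      = ceph
    TM_MAD      = ceph
    DISK_TYPE   = RBD
    POOL_NAME   = one                      # RBD pool holding the images
    CEPH_HOST   = "mon1:6789 mon2:6789"    # Ceph monitor(s)
    CEPH_USER   = libvirt
    CEPH_SECRET = "uuid-of-libvirt-secret"
    BRIDGE_LIST = "frontend1"              # host(s) with rbd CLI access

    # ceph_system_ds.txt
    NAME      = ceph_system
    TYPE      = SYSTEM_DS
    TM_MAD    = ceph
    POOL_NAME = one
    CEPH_HOST = "mon1:6789 mon2:6789"

With this layout the VM disks stay inside the Ceph pool and only small deployment files land on the hosts, so local/NFS storage requirements stay modest.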

Sure, this feature is being introduced in 4.14; I think it is exactly what you are looking for.
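
For reference, once you are on a release with this feature, the existing ("wild") VMs show up when you query the host and can be imported from there. A quick sketch, with the host ID as a placeholder; the exact import step (the Sunstone host view or a CLI subcommand) depends on the version, so check the documentation for the release you deploy:

    # Running VMs that are not managed by OpenNebula are listed in the
    # WILD VIRTUAL MACHINES section of the host's details
    onehost show <host_id>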

Absolutely, OpenNebula Systems offers this kind of service. You can complete this form to request more information.

Thanks for the links. The storage layout was documented but a little ambiguous. I’ll do some more reading.

Great! That’s exactly the clarification I was looking for, thank you!

I’ll definitely keep an eye out for 4.14 then.

Thanks again for your help!

Hi Jeff,

I agree with Tino: don't use NFS. Ceph is the way to go; we currently have a 50-node Ceph cluster and it's performing very well.

Hello,

There are a number of datastore drivers in OpenNebula (upstream and add-ons, see http://opennebula.org/addons/catalog/), including StorPool, Ceph, LVM, iSCSI, etc. VMs' virtual disks are stored in image datastores, while metadata such as the VM XML files and saved states go into a System Datastore. The System Datastore is either an NFS mount on all hosts, or the files are copied over scp; this depends on the backend you decide to use. Some backends copy images to and from the system datastore, while others, like StorPool, just place symlinks in it. Some backends are more capable than others, so I would definitely test extensively before picking one. The good part is that it is a modular system, so you can always switch to another storage backend later.
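
If it helps while evaluating backends, the transfer behaviour described above is visible straight from the CLI, since it is the TM_MAD attribute of each datastore that decides between the shared-mount, scp-copy, and map/symlink approaches (the datastore ID below is a placeholder):

    # DS_MAD stages images, TM_MAD moves or links disks to the hypervisors:
    # "shared" expects an NFS-style mount on all hosts, "ssh" copies over scp,
    # "ceph" and add-on drivers map or symlink volumes from the backend
    onedatastore list
    onedatastore show <datastore_id>

Since drivers are set per datastore (and datastores can be grouped per cluster), different backends can run side by side while you test.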

I think you probably recognize that the storage load of 10,000 VMs is not an easy thing to handle. We (StorPool) can definitely help with that and provide a solution better than Ceph.

As for importing existing VMs: as Tino wrote, I also believe Wild VMs are what you are looking for.

Also, if you would like, we can help with advice based on our experience with large-scale clouds (both OpenNebula and non-OpenNebula). Just drop us a line.

Best regards,
Sakis