New deployment advice

We currently have 3 CentOS 7/KVM servers and are adding 2 more. There are four IBM xSeries servers and two Supermicros, all with SSD storage set up as JBOD with Linux software RAID, ranging from 1 to 2 TB. Each server has 16 to 96 GB of RAM. The four currently in production have all been managed independently via virsh and virt-manager up to this point. I was interested in giving OpenNebula a try as we bring the two other IBM servers into production, or is it not worth it for this relatively small infrastructure?

One of my biggest questions is: what’s the best way to set up the virtual network(s)? Currently, we have two network subnets managed by the same router. We have three subnets assigned: a small /29 through which the data center connects to our router’s WAN port, plus a /26 on one LAN port and a /27 on another. Basically, we keep our own network of servers on the /26 and co-locations on the /27. I have never used VLANs, but both of our switches support them. All the IBM servers have 4 interfaces, while the Supermicros have two. If anyone has input or knows of any docs or guides related specifically to virtual network setup or VLANs in ONE, please share.
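
In case it helps, defining one of those subnets as a virtual network in ONE is mostly a small template; this is just a rough sketch using the 802.1Q driver, and the interface name, bridge, VLAN ID and addresses below are placeholders rather than anything from your actual setup:

# lan-26.net -- hypothetical template for the /26 server network
NAME    = "lan-26"
VN_MAD  = "802.1Q"     # ONE creates the VLAN-tagged bridge on each host
PHYDEV  = "eth1"       # physical NIC to hang the bridge off
VLAN_ID = 100          # must match the VLAN configured on the switches
BRIDGE  = "br-vlan100"
AR = [ TYPE = "IP4", IP = "192.0.2.10", SIZE = "50" ]   # pool of addresses ONE hands out
NETWORK_MASK = "255.255.255.192"
GATEWAY      = "192.0.2.1"
DNS          = "192.0.2.1"

You register it with "onevnet create lan-26.net". If you’d rather skip VLANs for now, the dummy driver with a pre-created Linux bridge per subnet works the same way minus the tagging.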

The other question I have is related to storage. I’ve been reading that GlusterFS is no longer recommended, so should I use Ceph for distributed storage? I’m not even sure whether I should go with a distributed file system at all or just use LVM datastores on each host. I tried NFS storage ages ago, before we migrated to KVM from ESXi, and had issues with speed; I’ve never tried a distributed file system, and our switches are Gigabit. With Ceph, should I change the chunk size to something larger than the default 64K? Any other advice or things to look out for? I have this doc for setting up Ceph:

https://www.howtoforge.com/tutorial/how-to-build-a-ceph-cluster-on-centos-7/
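
Once the Ceph side is built, hooking it into ONE is mostly a datastore template; a minimal sketch, assuming the stock ceph drivers, where the pool, monitor hosts, user and secret are all placeholders:

# cephds.conf -- sketch only; names and hosts are placeholders
NAME        = "cephds"
DS_MAD      = "ceph"
TM_MAD      = "ceph"
DISK_TYPE   = "RBD"
POOL_NAME   = "one"                   # RBD pool dedicated to OpenNebula images
CEPH_HOST   = "mon1 mon2 mon3"        # Ceph monitor hosts
CEPH_USER   = "libvirt"               # cephx user libvirt authenticates as
CEPH_SECRET = "<libvirt-secret-uuid>"
BRIDGE_LIST = "kvm1 kvm2"             # hosts used to stage image operations

Registered with "onedatastore create cephds.conf" once the libvirt secret is in place on the hosts.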

My plan at this point is to get the MySQL backend set up, and I’m using these docs for the deployment:

http://docs.opennebula.org/5.2/deployment/index.html
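
For the MySQL part, the switch is basically the DB block in /etc/one/oned.conf plus creating the database; a minimal sketch, with the password and database name as placeholders:

# create the database and user first (MariaDB on CentOS 7)
mysql -u root -p -e "CREATE DATABASE opennebula; \
  GRANT ALL PRIVILEGES ON opennebula.* TO 'oneadmin'@'localhost' IDENTIFIED BY 'onepass';"

# /etc/one/oned.conf -- replace the default sqlite DB block with this
DB = [ BACKEND = "mysql",
       SERVER  = "localhost",
       PORT    = 0,                   # 0 = default MySQL port
       USER    = "oneadmin",
       PASSWD  = "onepass",
       DB_NAME = "opennebula" ]

Then restart opennebula so oned comes up on the new backend; doing this before creating any resources saves migrating the sqlite data later.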

Thanks for any guidance on these topics!

I don’t recommend Ceph unless you can throw a lot of hardware at it. We have a 4 node Ceph cluster attached to our 5 node OpenNebula cluster and after a certain number of VMs we get astronomically high iowait times on the VMs. We’ve played with tuning Ceph six ways from Sunday and just can’t get it to perform well with only 4 nodes and GbE networking. The network is not saturated and I know the hardware for the Ceph nodes is not optimal. With Ceph you are also looking at a non-trivial administrative burden over time.
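
For anyone digging into the same symptom, nothing ONE-specific is needed to measure it, just standard tools (the pool name below is a placeholder):

# on a hypervisor or inside a VM: watch %iowait and per-device await
iostat -x 5

# raw throughput of the Ceph pool itself, bypassing the VM layer entirely
rados bench -p one 30 write --no-cleanup
rados bench -p one 30 seq
rados -p one cleanup

Comparing the two gives a rough sense of whether the bottleneck is in the pool itself or in the client/VM path.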

We’re in the process of moving all our VMs to local storage on the hypervisors.

Yeah, in my opinion, and from previous experience trying network storage, nothing beats local storage. Local storage with SSD drives just flies. I haven’t tried LVM either, just plain old virtual disks on the local file system so far. Is carving out LVM volumes for the VMs much better?
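
From what I’ve read so far, the equivalent of plain virtual disk files on the ONE side is the ssh transfer driver on the system and image datastores, while LVM-backed disks use the fs_lvm driver and expect a vg-one-<system_ds_id> volume group on each host. A minimal sketch of the ssh variant, with names as placeholders:

# local-system.conf -- VM disks live on each hypervisor's local SSDs
NAME   = "local_system"
TM_MAD = "ssh"
TYPE   = "SYSTEM_DS"

# local-images.conf -- image repo on the front-end, copied to hosts at deploy time
NAME   = "local_images"
DS_MAD = "fs"
TM_MAD = "ssh"
TYPE   = "IMAGE_DS"

Each gets registered with "onedatastore create". As I understand it, the main difference with fs_lvm is that VMs get raw logical volumes instead of qcow2 files, which skips the host file-system layer; whether that is noticeable on SSDs is something I’d rather benchmark than assume.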

Hello!

Have you looked at StorPool? I love it. It blows Ceph out of the water.

Thanks!