KVM/ONe Improved host network control


I recently attempted to build an OpenNebula cluster on KVM using HPE BL460s (which only have two Ethernet interfaces, especially if you have FC HBAs). The two interfaces were bonded together for HA and bandwidth, and added to a bridge with multiple tagged VLANs hanging off it. Everything worked well. The KVM host was managed by adding its IP to the bridge.

Upon deleting the VMs and attempting to redeploy, it became apparent that the bridge is deleted when no VMs require it. This is good practice - but not when the management interface is also on that bridge! :slight_smile:

As a workaround, the bonded interfaces now terminate in a bridge, connected to a “management bridge” via virtual wires (veth pairs). A second “VLAN” bridge is also connected via veth; this VLAN bridge gets deleted when there are no more VMs. This also allows a third bridge with an IP address to be connected, for encapsulating any VXLAN traffic into its own vxtransit VLAN. Nice!
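For anyone wanting to reproduce this, a rough iproute2 sketch of the plumbing (bridge, bond and veth names, and the management IP, are just examples, not the actual setup):

```shell
# Aggregation bridge: the bond terminates here
ip link add br-agg type bridge
ip link set bond0 master br-agg

# Management bridge: holds the host's IP, never touched by OpenNebula
ip link add br-mgmt type bridge

# Virtual wire between the aggregation and management bridges
ip link add veth-mgmt0 type veth peer name veth-mgmt1
ip link set veth-mgmt0 master br-agg
ip link set veth-mgmt1 master br-mgmt

ip link set br-agg up
ip link set br-mgmt up
ip link set veth-mgmt0 up
ip link set veth-mgmt1 up

# The host management IP lives on br-mgmt, out of OpenNebula's reach
ip addr add 192.0.2.10/24 dev br-mgmt
```

The "VLAN" bridge and the vxtransit bridge hang off br-agg in exactly the same way, each via its own veth pair.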

This is, however, a spaghetti of cobbled-together components, and it would be greatly simplified if OpenNebula had some awareness of non-managed L3 interfaces in Linux bridges, such as management interfaces, VTEPs etc. By contrast, VMware has the concept of the VMkernel: an object which sits on a vSwitch or Distributed Switch (VDS) for non-VM traffic such as vMotion, iSCSI and management, preventing vSwitches and port groups from being deleted.

I think it would be a great feature for OpenNebula to support more advanced host networking. Even if it just means bridging bridges with veth, at least this could be deployed and deleted dynamically as and when a host needs it.

I have a small mass of scripts and diagrams to share with anyone who wants (well…needs) to recreate this crazy setup :crazy_face:

You could try setting :keep_empty_bridge: true in /var/lib/one/remotes/etc/vnm/OpenNebulaNetwork.conf.

(and as it is a file in “remotes” - I’d re-sync the hosts with su - oneadmin -c 'onehost sync --force' after editing the file)
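Putting Anton's two steps together, something like this on the frontend should do it (the sed is a sketch that assumes the :keep_empty_bridge: key already exists in the file; otherwise just edit it by hand):

```shell
# Enable keep_empty_bridge so bridges survive when the last VM leaves
sed -i 's/^:keep_empty_bridge:.*/:keep_empty_bridge: true/' \
    /var/lib/one/remotes/etc/vnm/OpenNebulaNetwork.conf

# Push the updated "remotes" out to every host
su - oneadmin -c 'onehost sync --force'
```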

Best Regards,
Anton Todorov

Thanks Anton - I have ended up doing exactly that where only a management IP is required, but it’s insufficient if more complex networking (e.g. nesting VXLANs inside a VLAN) is needed. I suppose I am really just looking for solutions for more granular management of the host networking. Some configuration can be performed when the nodes are built (e.g. with Cobbler or Digital Rebar) or by an orchestration tool (Ansible?), but it would be fantastic if this level of control could be provided as part of on-boarding a host to OpenNebula, for example - or better still, applied dynamically across the cluster, ideally from Sunstone. :slight_smile:

Hi, have you tried replacing the Linux bridge with an Open vSwitch bridge?
You may get the VMware behaviour.

Good suggestion - and yes, I considered it, and it makes a big difference. The two challenges I ran into are:

  1. Terraform - it wasn’t clear how well Open vSwitch works with the OpenNebula Terraform provider.
  2. Network security groups don’t appear to be integrated.

I should probably explain the eventual specific use-case in a little more detail.
There are a number of existing deployments of our specific app stack in both AWS and Azure, both driven by Terraform. A lot of effort has gone into creating an architecture that delivers certain key features (especially security and scalability) and can be deployed into these two public clouds with minimal differences between the code bases. The dream is to create an iteration of this Terraform code that can also drive an on-premises OpenNebula deployment. Features like network security groups are considered extremely good practice in public cloud, and as such need to translate to an on-premises deployment. This results in some unusual architectural decisions, where the requirement is to deliver AWS/Azure-style functionality on equipment that pre-dates the requirement.

I am aware that this is an extreme edge case, and as a result I’m having to explore some potentially over-complex network topologies. The VMware comparison is fairly apt: it’s relatively trivial to build a single L2 aggregation and then run the ‘kernel’ services (management, vMotion, storage, VXLAN) in individual VLANs, complete with network traffic controls. Using NSX to provide VXLANs means that network security groups can be used to implement microsegmentation, with VXLANs carrying tenant/overlay traffic whilst existing VLANs are used within the datacenter itself.

Sorry - I am rambling. The answers given have all been very helpful, and help me to ensure I’ve fully explored the options; when you’re a ‘corner case’ it’s really good to hear other people’s perspectives. At the moment I am leaning towards setting “keep_empty_bridge” and using BGP EVPN to create a VXLAN topology over the limited number of physical NICs. Fortunately, storage can be handled over FC.
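For the record, the VTEP side of that plan looks roughly like this (a minimal static sketch; VNI, names and the local IP are made up, and in the real BGP EVPN setup route/MAC learning would come from a daemon such as FRR rather than flooding):

```shell
# VXLAN interface for VNI 100, terminating on the IP that sits in the
# vxtransit VLAN; nolearning because EVPN would populate the FDB
ip link add vxlan100 type vxlan id 100 dstport 4789 \
    local 198.51.100.10 nolearning

# Per-VNI bridge that VM NICs (or OpenNebula's bridge) would attach to
ip link add br-vni100 type bridge
ip link set vxlan100 master br-vni100
ip link set vxlan100 up
ip link set br-vni100 up
```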