'Static' VXLANs

An unusual suggestion:

VXLANs rely on UDP multicast to constrain layer 2 broadcast traffic to interested hosts.
This works nicely on relatively simple network topologies, or if you have nice, powerful switches that support lots of multicast groups. However, a multi-tenant environment with an allocation of VXLANs per client is likely to generate a lot of VXLANs, and on OpenNebula each VXLAN is assigned to its own multicast group.

With millions of VXLANs available, this doesn't scale brilliantly.
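For reference, the one-group-per-VXLAN pattern looks something like this at the Linux level (the interface name, VNI, group address and UDP port here are purely illustrative, not necessarily what the drivers generate):

# one multicast group per VNI - illustrative values only
ip link add vxlan100 type vxlan id 100 group 239.0.100.1 dev eth0 dstport 4789

Every VNI gets its own group, so the switches end up tracking one multicast group per virtual network.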

Suggestion 1: Best for scale-out
Allow consolidation of VXLANs into shared multicast groups. These could be grouped by tenant, constraining multicast traffic to hosts running that tenant's VMs. Even if a host then receives traffic for a specific VXLAN it doesn't need, this would still significantly reduce the number of multicast groups while still constraining traffic to the tenant's hosts.
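As a rough sketch of the idea at the Linux level (names, VNIs and group addresses below are made up), two VNIs belonging to the same tenant can simply point at the same multicast group, since the kernel demultiplexes incoming VXLAN packets by VNI:

# hypothetical: two tenant networks sharing one multicast group
ip link add vxlan100 type vxlan id 100 group 239.0.0.10 dev eth0 dstport 4789
ip link add vxlan200 type vxlan id 200 group 239.0.0.10 dev eth0 dstport 4789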

Suggestion 2: Best for smaller tenants on larger networks
'Head-end' replication: in a software-defined environment this is a highly elegant solution - set a target number of VXLANs per VTEP, and then as VMs are deployed or migrate, the OpenNebula management plane updates the bridging information across the cluster, e.g.:
bridge fdb append 00:00:00:00:00:00 dev vxlan100 dst 2001:db8:2::1
bridge fdb append 00:00:00:00:00:00 dev vxlan100 dst 2001:db8:3::100
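For this to work, the VXLAN device itself is created without any multicast group at all - a minimal sketch, with the device name, VNI and local address made up:

# no 'group' option: flooding is driven purely by the static FDB entries above
ip link add vxlan100 type vxlan id 100 dev eth0 dstport 4789 local 2001:db8:1::1

With learning left enabled, per-MAC entries are then picked up from return traffic, so only BUM frames get replicated to every listed VTEP.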

Whilst this isn't optimal for VXLANs with hundreds of participating hosts (each host must send each frame to each listed host), in a multi-tenant environment a tenant's traffic is likely to be constrained to a smaller number of hosts.
This removes the need for switches to actively partake in IGMP snooping and general multicast behaviour - instead, UDP traffic is sent point-to-point between hosts. I accept this adds significant load to the host interface (frames = traffic x number of participating hosts), but with 40Gb/s and 100Gb/s networking now approaching the commodity end of the market, this seems less and less of an issue compared with managing and fault-finding dynamic multicast groups.

It might also prove to be an extremely powerful tool for anyone building AWS environments - as I believe it would allow the construction of virtual networks between AWS hosts.

Hi Steve,

We did consider updating the FDB tables on the Linux bridges within the drivers, but ended up providing support for BGP EVPN instead: http://docs.opennebula.io/5.12/deployment/open_cloud_networking_setup/vxlan.html#using-vxlan-with-bgp-evpn. This should be more scalable, but there is some heavy lifting on the deployment side to set up FRRouting.
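To give an idea of what that heavy lifting looks like, a minimal sketch of the FRR side could be something along these lines (the AS number and peer address are just placeholders for illustration, and a real deployment needs a full peering design):

# hypothetical minimal EVPN config pushed through vtysh
vtysh -c 'configure terminal' \
      -c 'router bgp 65000' \
      -c 'neighbor 192.0.2.1 remote-as 65000' \
      -c 'address-family l2vpn evpn' \
      -c 'neighbor 192.0.2.1 activate' \
      -c 'advertise-all-vni'

The documentation linked above is the reference for the full setup.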

On the other hand you are right that this approach could be the best one when using AWS hosts/VMs as hypervisors…

:crazy_face: BGP EVPN got introduced two versions ago, and somehow I missed it!
Thanks Ruben - I can’t believe I didn’t spot that appear in 5.8 - sorry!

BGP EVPN definitely scales well - although, as you say, the setup does not look especially straightforward.
I've been using VM_hooks to run shell scripts to update the FDB tables, but it's a bit of a sketchy mess trying to do it dynamically. I would imagine it could be pretty straightforward to store this information in the management plane database and apply it to hosts when VMs are deployed. I appreciate that this is a lot of work for an edge case, though - with EVPN available there is a lot less necessity. Thank you for your quick and helpful answer, Ruben.
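For reference, the hooks boil down to something like the sketch below - the device name and peer list are placeholders, and in practice they would come from whatever records the management plane keeps:

#!/bin/sh
# hypothetical hook sketch: program head-end replication entries for one VXLAN
VXLAN_DEV=vxlan100
PEERS="2001:db8:2::1 2001:db8:3::100"

for peer in $PEERS; do
    # 'append' allows several all-zero entries, one per remote VTEP
    bridge fdb append 00:00:00:00:00:00 dev "$VXLAN_DEV" dst "$peer"
done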

Glad it helped you :slight_smile:

It is true that there is no automation to deploy the BGP router :frowning: Maybe this is something we could look at, like providing some Ansible recipes + a marketplace image… I imagine that keeping track of all the FDB entries you added in the hooks and debugging an unresponsive ping can quickly become a difficult task…
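Even just checking what is actually programmed on each host, e.g.:

bridge fdb show dev vxlan100

(using the example device name from earlier in the thread), gets tedious quickly once more than a handful of VTEPs are involved.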

Cheers