Connectivity problem with a second bridge on OVS

Hello,

We have a connectivity problem with a second bridge on OVS.

Configuration: we have an ovsbr0 bridge on OVS that is used for VM traffic. This bridge contains a bond of two interfaces (bond0) that carries all the customer VLANs tagged. We use a dedicated 10G NIC for Ceph storage traffic; its two ports are bonded (bond1) and carry two tagged storage VLANs. The network configuration seems to be OK: we added a bond1.xxx interface and assigned an IP address to it, and everything works fine (Ceph mounts work well on the hypervisors).

Now we have added a second bridge to OVS named “ceph” and added the bond1 interface to that bridge. After attaching a new vNIC to a VM, adding it to the ceph bridge with a tagged VLAN, and assigning an IP address to that vNIC, connectivity between the VM and the rest of the network fails.
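
For reference, the ceph bridge setup described above corresponds roughly to the commands below. This is a sketch reconstructed from the description, not a transcript of what was actually run. The port name one-1053-1 and tag 556 are taken from the ovs-vsctl show output further down, and the vNIC port is normally attached by the virtualization stack rather than by hand. The run helper and the APPLY variable are made up for this sketch; by default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: prints the commands by default; set APPLY=1 to really execute them.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

setup_ceph_bridge() {
    # Create the second bridge and attach the storage bond to it
    run ovs-vsctl --may-exist add-br ceph
    run ovs-vsctl --may-exist add-port ceph bond1
    # Access port for the VM's new vNIC in the storage VLAN
    # (assumption: the port already exists on the bridge)
    run ovs-vsctl set port one-1053-1 tag=556
}

setup_ceph_bridge
```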


We use CentOS 7 on all hypervisors:
OVS: ovs-vswitchd (Open vSwitch) 2.5.2
kernel: 4.16.0-1.el7.elrepo.x86_64
OpenNebula (one): 5.4.13
libvirtd: 4.5.0

# ovs-vsctl show
499e925c-fd74-4e40-9f64-50590869111b
    Bridge "ovsbr0"
        Port "one-1053-0"
            tag: 1018
            Interface "one-1053-0"
        Port "bond0"
            Interface "bond0"
        Port "ovsbr0"
            Interface "ovsbr0"
                type: internal
    Bridge ceph
        Port ceph
            Interface ceph
                type: internal
        Port "bond1"
            Interface "bond1"
        Port "one-1053-1"
            tag: 556
            Interface "one-1053-1"
    ovs_version: "2.5.2"

# ovs-appctl fdb/show ceph
 port  VLAN  MAC                Age

Ping fails from the VM to the rest of the network, as well as in the other direction, evidently because no MAC addresses have been learned on the ceph bridge.
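
To make the "no MAC addresses learned" check repeatable while testing, something like the following small helper can count the entries in the fdb/show output. It is only a sketch: count_fdb_entries is a hypothetical name, and it assumes the output format shown above (one header line followed by one line per learned MAC).

```shell
#!/bin/sh
# Count learned MAC entries in `ovs-appctl fdb/show <bridge>` output read from stdin.
# The first line is the column header; blank lines are ignored.
count_fdb_entries() {
    awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }'
}

# Usage on a hypervisor (needs a running ovs-vswitchd):
#   ovs-appctl fdb/show ceph | count_fdb_entries
```

With the output pasted above, this prints 0.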

If we try to trace that ping through OVS, we can see the flow counters incrementing when pinging from the VM to a physical server, and we can also see the ARP requests on the bond1 interface, but no replies.

# ovs-appctl bridge/dump-flows ceph
duration=85393s, n_packets=1290, n_bytes=54180, arp,arp_op=1,actions=FLOOD
duration=87992s, n_packets=296, n_bytes=13056, priority=0,actions=NORMAL
table_id=254, duration=87992s, n_packets=0, n_bytes=0, priority=2,recirc_id=0,actions=drop
table_id=254, duration=87992s, n_packets=0, n_bytes=0, priority=0,reg0=0x1,actions=controller(reason=no_match)
table_id=254, duration=87992s, n_packets=0, n_bytes=0, priority=0,reg0=0x2,actions=drop
table_id=254, duration=87992s, n_packets=0, n_bytes=0, priority=0,reg0=0x3,actions=drop
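
To see which counters move during a ping, the dump can be reduced to packet count plus match/actions and compared between two runs. A minimal sketch, assuming the comma-and-space separated layout shown above; flow_packet_counts is a hypothetical helper name.

```shell
#!/bin/sh
# Print "<n_packets> <match,actions>" for each flow line read from stdin.
# Fields in bridge/dump-flows output are separated by ", "; the last field
# holds the match and the actions.
flow_packet_counts() {
    awk -F', ' '{
        pkts = ""
        for (i = 1; i <= NF; i++)
            if ($i ~ /^n_packets=/) { pkts = $i; sub(/^n_packets=/, "", pkts) }
        if (pkts != "") print pkts, $NF
    }'
}

# Usage on a hypervisor (needs a running ovs-vswitchd):
#   ovs-appctl bridge/dump-flows ceph | flow_packet_counts
```

Running it before and after a few pings and comparing the two outputs shows which flows the traffic is hitting.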

MTU is set to 9000 everywhere.

Does anyone have an idea where to look and what to look for? We’ve run out of ideas…
If you need any other information, please let us know.

Best Regards!