Bridge is losing packets

We are running OpenNebula 5.8.1 on Ubuntu 18.04 and have the following issue:

  • We have created a backend network using 802.1q.
  • We have created a FW that connects the backend with the internet (NAT/portforward).
  • The FW has the IP spoofing protection removed on the backend interface.
  • We have systems on the backend that use the FW to access the internet.

The backend bridge is called onebr0. When a backend machine queries the internet, the packet traverses the bridge and is NATed by the FW; the answer is received and the FW sends it back to the backend machine.

Here comes the problem:
While the packet is sent out of the virtual interface of the FW VM (one-XXX-1), it never reaches the bridge (onebr0). I checked whether the packet is dropped by iptables, but I do not see any drops, nor do I see the packet traversing the FORWARD chain.

My suspicion is that the packet gets dropped before it is processed by iptables, but I have no idea why.

Any idea what could cause this issue? (Many thanks in advance, I am out of ideas at the moment …)

Are you sure the IP spoofing is disabled?

You should check for a match by issuing:

sudo ipset save | grep one-${VMID}-${NIC_ID}-ip-spoofing

There should also be no filter definition in the domain XML of the FW VM.
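The domain XML check can be sketched as follows. This is a minimal sketch, assuming the XML has been obtained with `virsh dumpxml`; the sample document, its values, and the helper name `find_filterrefs` are hypothetical and only illustrate where a libvirt `<filterref>` element would appear.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt of `virsh dumpxml` output; all values are illustrative.
SAMPLE_DOMAIN_XML = """
<domain type='kvm'>
  <name>one-214</name>
  <devices>
    <interface type='bridge'>
      <source bridge='br1'/>
      <target dev='one-214-0'/>
    </interface>
    <interface type='bridge'>
      <source bridge='onebr0'/>
      <target dev='one-214-1'/>
      <filterref filter='clean-traffic'>
        <parameter name='IP' value='10.20.0.1'/>
      </filterref>
    </interface>
  </devices>
</domain>
"""


def find_filterrefs(domain_xml):
    """Return {target device: filter name} for every NIC with a <filterref>."""
    root = ET.fromstring(domain_xml)
    found = {}
    for iface in root.iter("interface"):
        ref = iface.find("filterref")
        if ref is None:
            continue  # this NIC has no libvirt network filter attached
        target = iface.find("target")
        dev = target.get("dev") if target is not None else "(unnamed)"
        found[dev] = ref.get("filter")
    return found


if __name__ == "__main__":
    print(find_filterrefs(SAMPLE_DOMAIN_XML))
```

An empty result means no NIC of that domain references a libvirt network filter.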

Best Regards,
Anton Todorov

I am pretty sure. The following does not return anything:

root@n03:~# ipset save | grep one-214-1-ip-spoofing

Here is a sample of a packet trace from both the bridge (onebr0) and the internal (one-214-1) virtual interface of the FW vm. The FW vm has a security group that allows everything on the internal interface:

root@n03:~# iptables -L -n
...
Chain one-214-1-i (1 references)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            state RELATED,ESTABLISHED
RETURN     all  --  0.0.0.0/0            0.0.0.0/0
DROP       all  --  0.0.0.0/0            0.0.0.0/0

Chain one-214-1-o (1 references)
target     prot opt source               destination
DROP       all  --  0.0.0.0/0            0.0.0.0/0            MAC ! 02:00:0A:14:00:01
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            state RELATED,ESTABLISHED
RETURN     all  --  0.0.0.0/0            0.0.0.0/0
DROP       all  --  0.0.0.0/0            0.0.0.0/0
...

root@n03:~# tcpdump -i one-214-1 -n
08:54:29.127809 IP 10.20.2.1.44932 > 144.76.0.164.123: NTPv4, Client, length 48
08:54:29.141834 IP 144.76.0.164.123 > 10.20.2.1.44932: NTPv4, Server, length 48
08:54:32.700534 IP 10.20.0.3.41569 > 10.20.0.1.53: 18993+ PTR? 11.0.0.127.in-addr.arpa. (41)
08:54:32.702491 IP 10.20.0.1.53 > 10.20.0.3.41569: 18993 NXDomain* 0/1/0 (100)

root@n03:~# tcpdump -i onebr0 -n
08:54:29.127675 IP 10.20.2.1.44932 > 144.76.0.164.123: NTPv4, Client, length 48
08:54:32.700473 IP 10.20.0.3.41569 > 10.20.0.1.53: 18993+ PTR? 11.0.0.127.in-addr.arpa. (41)
08:54:32.702491 IP 10.20.0.1.53 > 10.20.0.3.41569: 18993 NXDomain* 0/1/0 (100)

As you can see on the NTP request, the answer packet sent by the virtual FW never makes it to the bridge device. It is sent out by the FW (one-214-1) but disappears afterwards.
Normal communication on the same subnet works (DNS request right afterwards). So it seems that it might be the ip spoofing rule but there is none, nor is there any filtering (security group, see above in iptables extract) … :-/
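The comparison of the two captures can also be done mechanically. The following is a rough sketch that uses the capture lines quoted above; the helper names are hypothetical, and matching packets by their text after the timestamp is a simplification that would not distinguish identical repeated packets.

```python
# Capture lines copied from the two tcpdump runs above.
TAP_CAPTURE = """\
08:54:29.127809 IP 10.20.2.1.44932 > 144.76.0.164.123: NTPv4, Client, length 48
08:54:29.141834 IP 144.76.0.164.123 > 10.20.2.1.44932: NTPv4, Server, length 48
08:54:32.700534 IP 10.20.0.3.41569 > 10.20.0.1.53: 18993+ PTR? 11.0.0.127.in-addr.arpa. (41)
08:54:32.702491 IP 10.20.0.1.53 > 10.20.0.3.41569: 18993 NXDomain* 0/1/0 (100)
"""

BRIDGE_CAPTURE = """\
08:54:29.127675 IP 10.20.2.1.44932 > 144.76.0.164.123: NTPv4, Client, length 48
08:54:32.700473 IP 10.20.0.3.41569 > 10.20.0.1.53: 18993+ PTR? 11.0.0.127.in-addr.arpa. (41)
08:54:32.702491 IP 10.20.0.1.53 > 10.20.0.3.41569: 18993 NXDomain* 0/1/0 (100)
"""


def packet_key(line):
    """Strip the leading timestamp so the same packet matches on both interfaces."""
    return line.split(" ", 1)[1].strip()


def missing_on_bridge(tap_capture, bridge_capture):
    """Return packets seen on the VM interface but not on the bridge."""
    bridge_keys = {packet_key(l) for l in bridge_capture.strip().splitlines()}
    return [
        packet_key(l)
        for l in tap_capture.strip().splitlines()
        if packet_key(l) not in bridge_keys
    ]


if __name__ == "__main__":
    for packet in missing_on_bridge(TAP_CAPTURE, BRIDGE_CAPTURE):
        print("lost between one-214-1 and onebr0:", packet)
```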

Weird indeed.

Another issue I’ve met was with NIC VF interfaces: they needed spoof checking disabled (spoof checking off) and trust enabled (trust on) on the VFs.

Not sure what NIC you use, but it is worth checking for such culprits …

Best,
Anton Todorov

We have Mellanox-X3 cards but we do not have VFs. We use VLANs for the different bridges and use the interface also for backend communication between the compute nodes, controller and storage (different VLAN).

root@n03:~# brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.0002c9xxxxxx       no              enp16s0
br1             8000.e41f13xxxxxx       no              eno2
                                                        one-214-0
onebr0          8000.0002c9xxxxxx       no              enp16s0.1001
                                                        one-214-1

Hi,

I have found the issue. It is caused by an ebtables rule set that is applied as well.
I do not understand why it is in place.
Extract:

Bridge chain: I-one-296-1-ipv4-ip, entries: 3, policy: ACCEPT
-p IPv4 --ip-src 0.0.0.0 --ip-proto udp -j RETURN
-p IPv4 --ip-src 10.20.0.1 -j RETURN
-j DROP

After modifying the DROP rule, NAT works as expected.
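Why this chain drops the NATed reply can be illustrated with a small simulation. This is only a sketch of the three rules quoted above, not of ebtables itself, and it assumes the chain guards the backend interface of the FW. A reply forwarded by the FW still carries the source address of the internet host (144.76.0.164 in the capture), not the FW's own address 10.20.0.1, so it falls through to the final DROP.

```python
import ipaddress

# The three rules of I-one-296-1-ipv4-ip as listed above:
# (source IP to match or None, protocol to match or None, target)
CHAIN = [
    ("0.0.0.0", "udp", "RETURN"),   # -p IPv4 --ip-src 0.0.0.0 --ip-proto udp -j RETURN
    ("10.20.0.1", None, "RETURN"),  # -p IPv4 --ip-src 10.20.0.1 -j RETURN
    (None, None, "DROP"),           # -j DROP
]


def verdict(src_ip, proto, chain=CHAIN):
    """Return the target of the first rule that matches the packet."""
    src = ipaddress.ip_address(src_ip)
    for rule_src, rule_proto, target in chain:
        if rule_src is not None and src != ipaddress.ip_address(rule_src):
            continue
        if rule_proto is not None and proto != rule_proto:
            continue
        return target
    return "ACCEPT"  # chain policy, only reached if no rule matches


if __name__ == "__main__":
    print("FW's own traffic:", verdict("10.20.0.1", "udp"))
    print("forwarded NTP reply:", verdict("144.76.0.164", "udp"))
```

Traffic that originates on the FW itself, such as the DNS answer from 10.20.0.1, matches the second rule, which is consistent with same-subnet communication working while forwarded replies disappear.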

I also saw that NIC = [ filter = "clean-traffic", model="virtio" ] was set in /etc/one/vmm_exec/vmm_exec_kvm.conf.
The vnet is set to VN_MAD="802.1Q".

Any idea what might cause this?

The issue still persists. To my understanding, these ebtables rules are created automatically by libvirt.
The templates are in /etc/libvirt/nwfilter. It seems that the OpenNebula orchestration does not really verify these rules.
So while the iptables IP spoofing protection is not present, libvirt is still blocking traffic.

This is a sample for a new VM:
Bridge chain: libvirt-I-one-392-0, entries: 9, policy: ACCEPT
1. -j I-one-392-0-mac
2. -p IPv4 -j I-one-392-0-ipv4-ip
3. -p IPv4 -j ACCEPT
4. -p ARP -j I-one-392-0-arp-mac
5. -p ARP -j I-one-392-0-arp-ip
6. -p ARP -j ACCEPT
7. -p 0x8035 -j I-one-392-0-rarp
8. -p 0x835 -j ACCEPT
9. -j DROP
...
Bridge chain: I-one-392-0-ipv4-ip, entries: 3, policy: ACCEPT
1. -p IPv4 --ip-src 0.0.0.0 --ip-proto udp -j RETURN
2. -p IPv4 --ip-src 10.20.0.2 -j RETURN
3. -j DROP
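To spot such chains quickly in a longer listing, the output can be scanned for chains whose last rule is an unconditional DROP. The sketch below parses the sample above (the elided part is left out); the helper name is hypothetical and the parser assumes the numbered listing format shown here.

```python
import re

# The ebtables listing quoted above, without the elided part.
LISTING = """\
Bridge chain: libvirt-I-one-392-0, entries: 9, policy: ACCEPT
1. -j I-one-392-0-mac
2. -p IPv4 -j I-one-392-0-ipv4-ip
3. -p IPv4 -j ACCEPT
4. -p ARP -j I-one-392-0-arp-mac
5. -p ARP -j I-one-392-0-arp-ip
6. -p ARP -j ACCEPT
7. -p 0x8035 -j I-one-392-0-rarp
8. -p 0x835 -j ACCEPT
9. -j DROP
Bridge chain: I-one-392-0-ipv4-ip, entries: 3, policy: ACCEPT
1. -p IPv4 --ip-src 0.0.0.0 --ip-proto udp -j RETURN
2. -p IPv4 --ip-src 10.20.0.2 -j RETURN
3. -j DROP
"""

HEADER = re.compile(r"Bridge chain: (\S+), entries: \d+, policy: \w+")


def chains_ending_in_drop(listing):
    """Return the names of chains whose final rule is a bare '-j DROP'."""
    chains = {}
    current = None
    for line in listing.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            chains[current] = []
        elif current is not None and line.strip():
            # Remove the rule number prefix ("3. ") before storing the rule.
            chains[current].append(re.sub(r"^\d+\.\s*", "", line.strip()))
    return [name for name, rules in chains.items() if rules and rules[-1] == "-j DROP"]


if __name__ == "__main__":
    print(chains_ending_in_drop(LISTING))
```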

I will modify the libvirt filter now, but it would be great if OpenNebula could handle this issue as well.