Connectivity problem with Linux VMs, Windows working fine

Hi,

I’ve been dealing with a very strange connectivity issue:

I have created VLAN 30 in OpenNebula (VN_MAD 802.1Q, PHYDEV enp5s0f1) in Ethernet mode, because the DHCP server is the router.
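In case it helps, the virtual network is defined along these lines (a sketch reconstructed from the settings above, not the exact template; SIZE is just an example value):

NAME    = "VLAN30"
VN_MAD  = "802.1Q"
PHYDEV  = "enp5s0f1"
VLAN_ID = "30"
AR      = [ TYPE = "ETHER", SIZE = "100" ]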

I’ve attached this network to a Windows Server VM in OpenNebula, and everything works fine (connectivity is OK, DHCP working, etc.).

Then I created some Ubuntu VMs and attached them to the same network. The VMs seem to be working fine and even get an IP address from my external DHCP server, but I can’t ping or SSH into them. I’m attaching a screenshot where you can see that the VM has Reception but no Transmission.

Of course, the Security Group is the same for all VMs (they are also on the same network), and the subinterface on the host is correct:

enp5s0f1.30 Link encap:Ethernet HWaddr 18:a9:05:41:ca:4f
inet6 addr: fe80::1aa9:5ff:fe41:ca4f/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:28432 errors:0 dropped:0 overruns:0 frame:0
TX packets:25263 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:37414700 (37.4 MB) TX bytes:3566314 (3.5 MB)

  • Is it normal that OpenNebula “can’t see” the IP the VM obtained via DHCP? I think this is the normal behaviour, but I’m not sure.

  • Why is this happening only with Ubuntu VMs and not with the other VM?

Morning Mistral!

If you’re using an Ethernet address range for your virtual network, you’re right, that’s the normal behavior: OpenNebula won’t show the IP address assigned to a VM by DHCP.

As you have the same configuration for both kinds of VMs, and as they’re on the same host, I don’t know what could be wrong. Something may be preventing ARP from working for the Ubuntu VM, so it can’t respond to ping and SSH traffic.
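One way to test that hypothesis from another machine in the same VLAN would be something like this (just a sketch; 192.168.30.55 and eth0 are placeholders for the Ubuntu VM’s address and your local interface):

$ sudo arping -I eth0 192.168.30.55    # does the Ubuntu VM answer ARP at all?
$ ping 192.168.30.55                   # and does it answer ICMP once the ARP entry exists?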

Can you run and paste the output of onevm show VMID -x (where VMID is the ID of the Ubuntu VM), and also run the iptables-save command on the host where the VM is running, so we can check whether any rule is not working as expected?
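That is, something like this (70 here is just an example VM ID):

$ onevm show 70 -x        # on the frontend: full XML description of the VM
$ sudo iptables-save      # on the KVM host where the VM is running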

Cheers!

The behavior seems to be different from what I first reported:

For some VMs, DHCP works fine and they get an IPv4 address. But for others it just doesn’t work (I’ve tried different images from the OpenNebula Marketplace: CentOS, Ubuntu, …).

I’ve run a Wireshark capture on my DHCP server’s interface, and what I’ve seen in all the cases that don’t get an IP is that they try to get an IPv6 address instead. This is all the traffic the VM generates:

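For reference, roughly the same capture can be reproduced from the command line on the DHCP server (a sketch; eth0 stands for the server’s interface name):

$ sudo tcpdump -eni eth0 'port 67 or port 68 or icmp6'   # DHCPv4 plus IPv6 neighbour discovery / router solicitation traffic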
I’ve also run the command you asked for on this VM:

$ onevm show 70
VIRTUAL MACHINE 70 INFORMATION
ID : 70
NAME : TEST_centos_IPV4espero
USER : marcel
GROUP : oneadmin
STATE : ACTIVE
LCM_STATE : RUNNING
RESCHED : No
HOST : 192.168.1.116
CLUSTER ID : 0
CLUSTER : default
START TIME : 11/05 00:11:18
END TIME : -
DEPLOY ID : one-70

VIRTUAL MACHINE MONITORING
CPU : 0.0
MEMORY : 2G
NETTX : 0K
NETRX : 1K

PERMISSIONS
OWNER : um-
GROUP : ---
OTHER : ---

VM DISKS
ID DATASTORE TARGET IMAGE SIZE TYPE SAVE
0 images_min vda CentOS 7.2 - KVM 51M/27G file NO
1 - hda CONTEXT 1M/- - -

VM NICS
ID NETWORK BRIDGE IP MAC PCI_ID
0 VLAN_ASB onebr.40 - 02:00:5e:6f:04:67

SECURITY

NIC_ID NETWORK SECURITY_GROUPS
0 VLAN_ASB 0

SECURITY GROUP TYPE PROTOCOL NETWORK RANGE
ID NAME VNET START SIZE
0 default OUTBOUND ALL
0 default INBOUND ALL

VIRTUAL MACHINE HISTORY
SEQ HOST ACTION DS START TIME PROLOG
0 192.168.1.116 poweroff 105 11/05 00:11:30 0d 00h01m 0h00m01s
1 192.168.1.116 poweroff 105 11/05 00:13:17 0d 00h01m 0h00m00s
2 192.168.1.116 poweroff 105 11/05 00:15:58 0d 00h03m 0h00m00s
3 192.168.1.116 poweroff 105 11/05 00:20:17 0d 00h04m 0h00m00s
4 192.168.1.116 none 105 11/05 00:26:24 0d 00h06m 0h00m00s

USER TEMPLATE
LOGO="images/logos/centos.png"
SCHED_MESSAGE="Sat Nov 5 00:11:24 2016 : No host with enough capacity to deploy the VM"

VIRTUAL MACHINE TEMPLATE
AUTOMATIC_DS_REQUIREMENTS="\"CLUSTERS/ID\" @> 0"
AUTOMATIC_REQUIREMENTS="(CLUSTER_ID = 0) & !(PUBLIC_CLOUD = YES)"
CONTEXT=[
DISK_ID="1",
ETH0_CONTEXT_FORCE_IPV4="",
ETH0_DNS="",
ETH0_GATEWAY="",
ETH0_GATEWAY6="",
ETH0_IP="",
ETH0_IP6="",
ETH0_IP6_ULA="",
ETH0_MAC="02:00:5e:6f:04:67",
ETH0_MASK="",
ETH0_MTU="",
ETH0_NETWORK="",
ETH0_SEARCH_DOMAIN="",
ETH0_VLAN_ID="40",
ETH0_VROUTER_IP="",
ETH0_VROUTER_IP6="",
ETH0_VROUTER_MANAGEMENT="",
NETWORK="YES",
SSH_PUBLIC_KEY="",
TARGET="hda" ]
CPU="1"
GRAPHICS=[
LISTEN="0.0.0.0",
PORT="5970",
TYPE="vnc" ]
MEMORY="2048"
OS=[
ARCH="x86_64" ]
TEMPLATE_ID="38"
VMID="70"

It’s all very strange, actually. Any ideas?

What I want to try next is using Open vSwitch instead of 802.1Q. Any good tutorial or article on how to integrate it with OpenNebula?

Thanks again.

Hi!
According to the onevm show output, the default security group is being used, so it shouldn’t be affecting DHCP traffic, as it allows all inbound and outbound traffic.

The output of iptables-save on your hosts would be helpful to check whether any rule is somehow blocking the VM’s traffic, and you can also read this recent post where I discussed basic network troubleshooting.
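It may also be worth dumping the bridge-level rules, since ARP itself is not handled by iptables (a sketch; these are standard Linux tools, nothing OpenNebula-specific):

$ sudo iptables-save            # the security group rules (one-<vmid>-<nicid> chains)
$ sudo ebtables -L --Lc         # filter table: rules that could drop ARP or other bridged frames
$ sudo ebtables -t nat -L --Lc  # nat table, where MAC-based filters sometimes live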

If you want to use openvswitch, I’d do the following:

  1. Install Open vSwitch on your hosts. Some distros have openvswitch in their repositories; others require more work. What distro are you using? I could try to help you with CentOS.
  2. You’ll need to create one or more bridges that use Open vSwitch on each of your hosts, as explained here (see the sketch after this list).
  3. You’ll configure virtual networks that use those bridges, as explained here. You can set VLANs with OpenNebula.
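A minimal sketch of steps 2 and 3, assuming a bridge called ovsbr0 and the physical NIC enp5s0f1 (both names are only examples, and ovs.net is a hypothetical file name):

$ sudo ovs-vsctl add-br ovsbr0             # on each KVM host: create the OVS bridge
$ sudo ovs-vsctl add-port ovsbr0 enp5s0f1  # attach the physical NIC to it
$ cat ovs.net                              # on the frontend: a virtual network using that bridge
NAME    = "VLAN_30_OVS"
VN_MAD  = "ovswitch"
BRIDGE  = "ovsbr0"
VLAN_ID = "30"
AR      = [ TYPE = "ETHER", SIZE = "100" ]
$ onevnet create ovs.net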

The next steps would depend on your scenario, e.g. you may need VXLAN or tunnels if the hosts are not in the same LAN, or an OpenFlow controller like OpenDaylight if you want to set up a more sophisticated environment. We can keep updating this post as we go.

I can help you set this up, and we could improve the documentation together.

Cheers!

First of all, thanks for the help, Miguel. If I can help improve this part of the documentation once this issue is solved, just let me know.

The frontend and all the nodes are on the 192.168.1.x/24 segment (untagged).

The VMs are in VLAN 30, so they use onebr.30, which was autogenerated by OpenNebula’s 802.1Q driver.
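A quick way to double-check the tag and bridge membership on the host (a sketch using the names above):

$ ip -d link show enp5s0f1.30     # should report "vlan protocol 802.1Q id 30"
$ bridge fdb show br onebr.30     # MAC addresses the bridge has learned, including the VMs'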

This is the output of brctl show:

This is the output of the iptables-save command:

$ sudo iptables-save

*mangle
:PREROUTING ACCEPT [651905:1891961076]
:INPUT ACCEPT [637816:1890609605]
:FORWARD ACCEPT [18072:2618634]
:OUTPUT ACCEPT [615319:1295103603]
:POSTROUTING ACCEPT [633391:1297722237]
-A POSTROUTING -o virbr0 -p udp -m udp --dport 68 -j CHECKSUM --checksum-fill
COMMIT

*nat
:PREROUTING ACCEPT [3417:433709]
:INPUT ACCEPT [112:23452]
:OUTPUT ACCEPT [1401:207645]
:POSTROUTING ACCEPT [4706:617902]
-A POSTROUTING -s 192.168.122.0/24 -d 224.0.0.0/24 -j RETURN
-A POSTROUTING -s 192.168.122.0/24 -d 255.255.255.255/32 -j RETURN
-A POSTROUTING -s 192.168.122.0/24 ! -d 192.168.122.0/24 -p tcp -j MASQUERADE --to-ports 1024-65535
-A POSTROUTING -s 192.168.122.0/24 ! -d 192.168.122.0/24 -p udp -j MASQUERADE --to-ports 1024-65535
-A POSTROUTING -s 192.168.122.0/24 ! -d 192.168.122.0/24 -j MASQUERADE
COMMIT

*filter
:INPUT ACCEPT [418608:883966042]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [372290:1213476376]
:one-44-0-i - [0:0]
:one-44-0-o - [0:0]
:one-44-1-i - [0:0]
:one-44-1-o - [0:0]
:one-53-0-i - [0:0]
:one-53-0-o - [0:0]
:one-70-1-i - [0:0]
:one-70-1-o - [0:0]
:opennebula - [0:0]
-A INPUT -i virbr0 -p udp -m udp --dport 53 -j ACCEPT
-A INPUT -i virbr0 -p tcp -m tcp --dport 53 -j ACCEPT
-A INPUT -i virbr0 -p udp -m udp --dport 67 -j ACCEPT
-A INPUT -i virbr0 -p tcp -m tcp --dport 67 -j ACCEPT
-A FORWARD -d 192.168.122.0/24 -o virbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -s 192.168.122.0/24 -i virbr0 -j ACCEPT
-A FORWARD -i virbr0 -o virbr0 -j ACCEPT
-A FORWARD -o virbr0 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -i virbr0 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -m physdev --physdev-is-bridged -j opennebula
-A OUTPUT -o virbr0 -p udp -m udp --dport 68 -j ACCEPT
-A one-44-0-i -m state --state RELATED,ESTABLISHED -j ACCEPT
-A one-44-0-i -j RETURN
-A one-44-0-i -j DROP
-A one-44-0-o -m state --state RELATED,ESTABLISHED -j ACCEPT
-A one-44-0-o -j RETURN
-A one-44-0-o -j DROP
-A one-44-1-i -m state --state RELATED,ESTABLISHED -j ACCEPT
-A one-44-1-i -j RETURN
-A one-44-1-i -j DROP
-A one-44-1-o -m state --state RELATED,ESTABLISHED -j ACCEPT
-A one-44-1-o -j RETURN
-A one-44-1-o -j DROP
-A one-53-0-i -m state --state RELATED,ESTABLISHED -j ACCEPT
-A one-53-0-i -j RETURN
-A one-53-0-i -j DROP
-A one-53-0-o -m state --state RELATED,ESTABLISHED -j ACCEPT
-A one-53-0-o -j RETURN
-A one-53-0-o -j DROP
-A one-70-1-i -m state --state RELATED,ESTABLISHED -j ACCEPT
-A one-70-1-i -j RETURN
-A one-70-1-i -j DROP
-A one-70-1-o -m state --state RELATED,ESTABLISHED -j ACCEPT
-A one-70-1-o -j RETURN
-A one-70-1-o -j DROP
-A opennebula -m physdev --physdev-in one-70-1 --physdev-is-bridged -j one-70-1-o
-A opennebula -m physdev --physdev-out one-70-1 --physdev-is-bridged -j one-70-1-i
-A opennebula -m physdev --physdev-in one-44-1 --physdev-is-bridged -j one-44-1-o
-A opennebula -m physdev --physdev-out one-44-1 --physdev-is-bridged -j one-44-1-i
-A opennebula -m physdev --physdev-in one-44-0 --physdev-is-bridged -j one-44-0-o
-A opennebula -m physdev --physdev-out one-44-0 --physdev-is-bridged -j one-44-0-i
-A opennebula -m physdev --physdev-in one-53-0 --physdev-is-bridged -j one-53-0-o
-A opennebula -m physdev --physdev-out one-53-0 --physdev-is-bridged -j one-53-0-i
-A opennebula -j ACCEPT
COMMIT

Capturing on the subinterface, it only receives packets when pinging from the outside:

The captured PCAP shows that ARP requests from the outside are reaching the node:

When running tcpdump and doing nothing from the outside, 0 packets are captured.
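For anyone trying to reproduce this, captures of this kind can be done with something like (a sketch; one-70-1 is the tap device name taken from the iptables rules above):

$ sudo tcpdump -eni enp5s0f1.30                            # on the host's 802.1Q subinterface
$ sudo tcpdump -eni one-70-1 'arp or port 67 or port 68'   # on the VM's tap device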

Clearly, the VMs deployed from the OpenNebula Marketplace are not asking for an IP (they don’t send a DHCP discover or anything else). But if I change the Virtual Network address range from “Ethernet” to “IPv4”, they get an IP and connectivity is fine; everything works in that scenario.
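One thing still worth verifying from the VM console is whether the guest ever tries DHCP at all (a sketch; eth0 is just whatever the guest calls its interface):

$ ip addr show eth0       # interface up, but no IPv4 address?
$ sudo dhclient -v eth0   # force a lease request; if this works, the network path is fine
                          # and the problem is in the image's contextualization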

As I have a mix of KVM and VMware nodes, I wanted my DHCP server to be an independent server, but I’m assuming that for some reason (a bug, maybe?) this will not be possible.

So, a stupid question here: can I use OpenNebula Virtual Networks (802.1Q, OVS, or whatever is capable of tagging) to act as the DHCP server for both hypervisors?
Answer: I think it is not possible, but I just want to be sure.

Thanks again.