Unable to migrate VM created on specific Host

Hello,

I have 3 hosts, one of which also acts as the NFS datastore and hosts the frontend software.
I am using OpenNebula 4.6.2 with KVM hypervisors under CentOS 6.5.
I can create a vm on any host on any datastore I choose. Datastore 100 is the one we use as SSH local datastore.

If I create a vm on host 0 or 1, I can migrate the vm to any host.

But if I create a vm on host 2, files get transferred OK but migration fails and I have to delete the vm :smile:

Thu Jun 25 10:19:11 2015 [LCM][I]: New VM state is SAVE_MIGRATE
Thu Jun 25 10:19:13 2015 [VMM][I]: ExitCode: 0
Thu Jun 25 10:19:13 2015 [VMM][I]: Successfully execute virtualization driver operation: save.
Thu Jun 25 10:19:13 2015 [VMM][I]: ExitCode: 0
Thu Jun 25 10:19:13 2015 [VMM][I]: Successfully execute network driver operation: clean.
Thu Jun 25 10:19:13 2015 [LCM][I]: New VM state is PROLOG_MIGRATE
Thu Jun 25 10:19:28 2015 [LCM][I]: New VM state is BOOT
Thu Jun 25 10:19:28 2015 [VMM][I]: ExitCode: 0
Thu Jun 25 10:19:28 2015 [VMM][I]: Successfully execute network driver operation: pre.
Thu Jun 25 10:19:28 2015 [VMM][I]: Command execution fail: /var/tmp/one/vmm/kvm/restore '/var/lib/one//datastores/100/86/checkpoint' 'destination-host.example.com' 'one-86' 86 destination-host.example.com
Thu Jun 25 10:19:28 2015 [VMM][E]: restore: Command "virsh --connect qemu:///system restore /var/lib/one//datastores/100/86/checkpoint" failed: error: Failed to restore domain from /var/lib/one//datastores/100/86/checkpoint
Thu Jun 25 10:19:28 2015 [VMM][I]: error: internal error process exited while connecting to monitor: Supported machines are:
Thu Jun 25 10:19:28 2015 [VMM][I]: pc RHEL 6.5.0 PC (alias of rhel6.5.0)
Thu Jun 25 10:19:28 2015 [VMM][I]: rhel6.5.0 RHEL 6.5.0 PC (default)
Thu Jun 25 10:19:28 2015 [VMM][I]: rhel6.4.0 RHEL 6.4.0 PC
Thu Jun 25 10:19:28 2015 [VMM][I]: rhel6.3.0 RHEL 6.3.0 PC
Thu Jun 25 10:19:28 2015 [VMM][I]: rhel6.2.0 RHEL 6.2.0 PC
Thu Jun 25 10:19:28 2015 [VMM][I]: rhel6.1.0 RHEL 6.1.0 PC
Thu Jun 25 10:19:28 2015 [VMM][I]: rhel6.0.0 RHEL 6.0.0 PC
Thu Jun 25 10:19:28 2015 [VMM][I]: rhel5.5.0 RHEL 5.5.0 PC
Thu Jun 25 10:19:28 2015 [VMM][I]: rhel5.4.4 RHEL 5.4.4 PC
Thu Jun 25 10:19:28 2015 [VMM][I]: rhel5.4.0 RHEL 5.4.0 PC
Thu Jun 25 10:19:28 2015 [VMM][E]: Could not restore from /var/lib/one//datastores/100/86/checkpoint
Thu Jun 25 10:19:28 2015 [VMM][I]: ExitCode: 1
Thu Jun 25 10:19:28 2015 [VMM][I]: Failed to execute virtualization driver operation: restore.
Thu Jun 25 10:19:28 2015 [VMM][E]: Error restoring VM: Could not restore from /var/lib/one//datastores/100/86/checkpoint
Thu Jun 25 10:19:28 2015 [DiM][I]: New VM state is FAILED

I’ve checked everything I know how to check (as per http://docs.opennebula.org/4.12/administration/virtualization/kvmg.html).
The thing is that host 0 and 1 didn’t have libvirt listen_tcp = 1 or --listen and migration was working anyway.
I cleared all firewall rules, also.

Can anyone suggest a pointer, a log to check, or a test to run that would point me in the right direction to solve this issue?

Thanks,
regards,
Sergi

This problem usually occurs because your hosts have different CPUs/architectures or
libvirt installations.
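
One way to test this, sketched here rather than taken from any official OpenNebula procedure, is to compare the machine types that qemu-kvm supports on each host, since the restore error in the log lists the types the destination accepts. The helper names `machine_names` and `missing_on` are made up for illustration, and the `/usr/libexec/qemu-kvm` path is an assumption based on the usual CentOS 6 package layout.

```sh
# Print only the machine type names from `qemu-kvm -M ?` style output
# (first column, skipping the "Supported machines are:" header).
machine_names() {
    grep -v '^Supported machines are:' | awk 'NF { print $1 }' | sort -u
}

# Print the names listed in file $1 that are absent from file $2.
missing_on() {
    grep -vxF -f "$2" "$1"
}

# Usage on real hosts (run as oneadmin; host names are placeholders):
#   ssh host1 /usr/libexec/qemu-kvm -M '?' | machine_names > host1.machines
#   ssh host2 /usr/libexec/qemu-kvm -M '?' | machine_names > host2.machines
#   missing_on host2.machines host1.machines   # types host2 has but host1 lacks
```

If the last command prints anything, a VM started on host2 may be using a machine type that host1 cannot restore.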

Hello Ruben,

as far as I can check, both hosts have identical hardware, and the libvirt versions are nearly the same (0.10.2-29 and 0.10.2-46).
As I said, if the vm is created on host 0 or 1, it can be migrated to/from any host (0, 1 or 2), but it can’t be migrated if created on host 2.

Well,

it seems that vm’s that started up on host 2 are stuck on host 2, and a migration to another host means ‘deleting it’.

I tried to migrate a vm from host 2 to host 1, a vm originally created on host 0 a long time ago, and I have lost this vm.

I did copy the files, though.
Is there any procedure to recover from a failed state, or to take this vm back to host 2?

Well, out of desperation I tried

virsh create deployment.4 (which is the last deployment file there is)

and got :

virsh create deployment.4
error: Failed to create domain from deployment.4
error: Unable to create tap device vnet%d: Operation not permitted

on both host 1 (where I wanted the vm to migrate) and host 2 (where the vm came from)

Is there any other way?

Thanks,
Sergi
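
A possible explanation for the tap device error, offered as a guess rather than a confirmed diagnosis: a plain `virsh create` run by a non-root user connects to the unprivileged session URI by default, which is not allowed to create tap devices, whereas the OpenNebula driver log above shows virsh being called with `--connect qemu:///system`. The wrapper name `system_virsh` below is hypothetical. Note that a domain started this way runs outside OpenNebula's control, so the frontend would still show the VM as FAILED.

```sh
# Run virsh against the privileged system URI, as the OpenNebula
# KVM driver does in the log. VIRSH can be overridden for a dry run.
system_virsh() {
    ${VIRSH:-virsh} --connect qemu:///system "$@"
}

# Usage, from the VM directory on the host:
#   system_virsh create deployment.4
# Dry run that only prints the arguments:
#   VIRSH=echo system_virsh create deployment.4
```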

ssesubs forum@opennebula.org writes:

Well,

Hello,

it seems that vm’s that started up on host 2 are stuck on host 2, and a migration to another host means ‘deleting it’.

Can host2 connect to the other hosts with a passwordless key?

oneadmin@host2:~$ ssh host1

If I remember correctly, live migrations are tunnelled over SSH.
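
A small sketch to run that check against several hosts at once. The function name `check_passwordless` is made up for illustration. `BatchMode=yes` makes ssh fail instead of prompting for a password or an unknown host key, so a FAIL here can mean either problem.

```sh
# Report whether each host accepts a key-based login without prompting.
# SSH can be overridden for a dry run; it defaults to the real ssh client.
check_passwordless() {
    for h in "$@"; do
        if ${SSH:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$h" true 2>/dev/null; then
            echo "$h OK"
        else
            echo "$h FAIL"
        fi
    done
}

# Usage, as oneadmin on host2 (host names are placeholders):
#   check_passwordless host0 host1
```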

Regards.

Daniel Dehennin

Hello Daniel,

I missed the notification about your reply.

Yes, any host can connect to any host passwordless.
I found that the problem is that host2 is a CentOS 6.6 server while host1 is a CentOS 6.5 server.
It seems that libvirt doesn’t allow migrating from a 6.6 host to a 6.5 host.
I asked the forum for a clue on how to override this, but have had no response yet.
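
For anyone hitting the same error, here is a rough way to confirm this. The restore output in the log lists machine types only up to rhel6.5.0, which suggests the checkpoint asks for a newer type that the 6.5 host does not know. The sketch below reads libvirt domain XML and prints the machine type it requests. The function names `domain_machine` and `supported_on` are hypothetical, and it assumes your libvirt build provides `virsh save-image-dumpxml`.

```sh
# Print the machine attribute of the <type> element from libvirt
# domain XML read on stdin (first match only).
domain_machine() {
    sed -n "s/.*<type[^>]*machine=['\"]\([^'\"]*\)['\"].*/\1/p" | head -n 1
}

# Succeed if machine type $1 appears in file $2 (one type name per line).
supported_on() {
    grep -qxF -- "$1" "$2"
}

# Usage (paths and VM name taken from the log above):
#   virsh --connect qemu:///system dumpxml one-86 | domain_machine
#   virsh --connect qemu:///system save-image-dumpxml \
#       /var/lib/one//datastores/100/86/checkpoint | domain_machine
```

If the printed type is not in the destination host's supported list, the restore there would be expected to fail the way it does in the log.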

Regards,