SOLVED Live Migration "failed: error: Cannot access storage file" on 5.2.1

Hi,

I am facing some issues with live migration; all I get is the following in the logs.

Thu Feb 9 14:53:27 2017 [Z0][VM][I]: New LCM state is MIGRATE
Thu Feb 9 14:53:28 2017 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_premigrate.
Thu Feb 9 14:53:28 2017 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Thu Feb 9 14:53:28 2017 [Z0][VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/kvm/migrate 'one-46' 'compute01' 'compute02' 46 compute02
Thu Feb 9 14:53:28 2017 [Z0][VMM][E]: migrate: Command "virsh --connect qemu:///system migrate --live one-46 qemu+ssh://compute01/system" failed: error: Cannot access storage file '/var/lib/one//datastores/0/46/disk.1' (as uid:9869, gid:9869): No such file or directory
Thu Feb 9 14:53:28 2017 [Z0][VMM][E]: Could not migrate one-46 to compute01
Thu Feb 9 14:53:28 2017 [Z0][VMM][I]: ExitCode: 1
Thu Feb 9 14:53:28 2017 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_failmigrate.
Thu Feb 9 14:53:28 2017 [Z0][VMM][I]: Failed to execute virtualization driver operation: migrate.
Thu Feb 9 14:53:28 2017 [Z0][VMM][E]: Error live migrating VM: Could not migrate one-46 to compute01
Thu Feb 9 14:53:28 2017 [Z0][VM][I]: New LCM state is RUNNING
Thu Feb 9 14:53:28 2017 [Z0][LCM][I]: Fail to live migrate VM. Assuming that the VM is still RUNNING (will poll VM).

I am not really sure why it cannot access the file as the oneadmin user (which has UID 9869, as intended).
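
One thing I could still check (just a sketch, using the hostnames and VM ID from the log above) is whether the destination host can actually see that path, since the virsh command migrates the VM to compute01:

  $ ssh oneadmin@compute01 ls -l /var/lib/one/datastores/0/46/
  $ ssh oneadmin@compute02 ls -l /var/lib/one/datastores/0/46/

If the directory only exists on the source node, the destination simply has no copy of the disk to attach during the live migration.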

I am using the following setup:

  • All hosts run Debian 8.7.1
  • compute01 and compute02 are my host nodes, which run the VMs
  • controller01 runs the Sunstone frontend
  • I can SSH passwordlessly as oneadmin from any host to any other host
  • the datastores (except datastore 0) are replicated with GlusterFS across all compute nodes

I have also searched the forum but was not able to find anything that fits my case.

Any feedback is much appreciated.

Thanks
Pecadis

Hi Pecadis,

Please provide some more info regarding your setup, such as the OpenNebula version, datastore configuration, etc. At the very least, the output of onedatastore list.
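
Something like the following, run as oneadmin on the frontend, would show the datastores and which TM_MAD each one uses (datastore 0 is just picked as an example here):

  $ onedatastore list
  $ onedatastore show 0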

Kind Regards,
Anton Todorov

Hi Anton,

The OpenNebula version is 5.2.1, as mentioned in the subject.
All servers were set up the same way at the same time (this is just a test environment).

Here is the output of onedatastore list:

Thanks
Pecadis

Hi Pecadis,

Sorry, I missed it.

I think you should make the system datastore (ID 0) shared, as the ssh TM_MAD does not support live migration.
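
A rough sketch of the change (assuming the stock Filesystem datastore drivers; adjust to your setup):

  $ onedatastore update 0
  (in the editor that opens, change TM_MAD="ssh" to TM_MAD="shared")

Note that the shared TM_MAD only works if /var/lib/one/datastores/0 really is the same filesystem on all compute nodes, e.g. exported over NFS or, in your case, a replicated GlusterFS volume.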

Kind Regards,
Anton Todorov

Hi Anton,

just to be clear, should the system datastore (ID 0) be replicated across all compute nodes?

Thank You
Pecadis

Hi Pecadis,

Yes, and only between the compute nodes. For further info, take a look at the Filesystem Datastore chapter of the documentation.
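
Since you already replicate the other datastores with GlusterFS, one option (only a sketch; the volume name ds0 and the brick host are assumptions, not taken from your setup) is to mount a replicated volume at the system datastore path on both compute nodes:

  $ mount -t glusterfs compute01:/ds0 /var/lib/one/datastores/0

and add a matching entry to /etc/fstab so it survives reboots. Make sure the mounted directory is owned by oneadmin (UID 9869 in your case) on both nodes.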

Kind Regards,
Anton Todorov

Hi Anton,

thank you a lot. Unfortunately, the documentation for live migration is not as clear as it is for other topics.

I have set up the system datastore as shared and it works well.
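
For anyone who finds this thread later: a quick way to verify the fix (the VM ID and host name here are taken from the log above; substitute your own) is to trigger a live migration from the CLI and check where the VM ends up:

  $ onevm migrate --live 46 compute01
  $ onevm list

onevm list shows the host the VM is running on and whether it stayed in the RUNNING state.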

I think we can close this case. =)

Best
Pecadis

You are right, @Pecadis… We would gladly accept a PR to fix this in the documentation :slight_smile:

EDIT: I just created this: https://dev.opennebula.org/issues/5020