I am using Debian 9, OpenNebula 5.4.6, and drbdadm 9.2.0 with the DRBD driver from GitHub. Everything else works correctly: I can download images and start VMs on the DRBD datastore. But when I try to migrate a VM (live or non-live), it fails. Here is the output:
onedatastore show 107:
DATASTORE 107 INFORMATION
ID : 107
NAME : drbdmanage_redundant
USER : oneadmin
GROUP : oneadmin
CLUSTERS : 0,100
TYPE : IMAGE
DS_MAD : drbdmanage
TM_MAD : drbdmanage
BASE PATH : /var/lib/one//datastores/107
DISK_TYPE : FILE
STATE : READY
DATASTORE CAPACITY
TOTAL: : 3.8T
FREE: : 3.7T
USED: : 0M
LIMIT: : -
PERMISSIONS
OWNER : um-
GROUP : u--
OTHER : ---
DATASTORE TEMPLATE
ALLOW_ORPHANS="NO"
BRIDGE_LIST="virt1 virt2"
CLONE_TARGET="SELF"
DISK_TYPE="FILE"
DRBD_REDUNDANCY="2"
DRBD_SUPPORT_LIVE_MIGRATION="yes"
DS_MAD="drbdmanage"
LN_TARGET="NONE"
RESTRICTED_DIRS="/"
SAFE_DIRS="/var/tmp"
TM_MAD="drbdmanage"
IMAGES
32
Log on live migrate:
Tue Jan 30 15:14:39 2018 [Z0][VM][I]: New LCM state is RUNNING
Tue Jan 30 15:19:25 2018 [Z0][VM][I]: New LCM state is MIGRATE
Tue Jan 30 15:19:25 2018 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_premigrate.
Tue Jan 30 15:19:25 2018 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Tue Jan 30 15:19:26 2018 [Z0][VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/kvm/migrate 'one-38' 'virt1' 'virt2' 38 virt2
Tue Jan 30 15:19:26 2018 [Z0][VMM][E]: migrate: Command "virsh --connect qemu:///system migrate --live one-38 qemu+ssh://virt1/system" failed: error: Cannot access storage file '/var/lib/one//datastores/0/38/disk.1' (as uid:9869, gid:9869): No such file or directory
Tue Jan 30 15:19:26 2018 [Z0][VMM][E]: Could not migrate one-38 to virt1
Tue Jan 30 15:19:26 2018 [Z0][VMM][I]: ExitCode: 1
Tue Jan 30 15:19:26 2018 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_failmigrate.
Tue Jan 30 15:19:26 2018 [Z0][VMM][I]: Failed to execute virtualization driver operation: migrate.
Tue Jan 30 15:19:26 2018 [Z0][VMM][E]: Error live migrating VM: Could not migrate one-38 to virt1
Tue Jan 30 15:19:26 2018 [Z0][VM][I]: New LCM state is RUNNING
Tue Jan 30 15:19:26 2018 [Z0][LCM][I]: Fail to live migrate VM. Assuming that the VM is still RUNNING (will poll VM).
Log on non-live migrate:
Tue Jan 30 15:21:10 2018 [Z0][VM][I]: New LCM state is SAVE_MIGRATE
Tue Jan 30 15:21:12 2018 [Z0][VMM][I]: /var/tmp/one/vmm/kvm/save: line 58: warning: command substitution: ignored null byte in input
Tue Jan 30 15:21:12 2018 [Z0][VMM][I]: ExitCode: 0
Tue Jan 30 15:21:12 2018 [Z0][VMM][I]: Successfully execute virtualization driver operation: save.
Tue Jan 30 15:21:12 2018 [Z0][VMM][I]: Successfully execute network driver operation: clean.
Tue Jan 30 15:21:12 2018 [Z0][VM][I]: New LCM state is PROLOG_MIGRATE
Tue Jan 30 15:21:23 2018 [Z0][VM][I]: New LCM state is BOOT_MIGRATE
Tue Jan 30 15:21:23 2018 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
Tue Jan 30 15:21:23 2018 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Tue Jan 30 15:21:24 2018 [Z0][VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/kvm/restore '/var/lib/one//datastores/0/38/checkpoint' 'virt1' 'one-38' 38 virt1
Tue Jan 30 15:21:24 2018 [Z0][VMM][I]: /var/tmp/one/vmm/kvm/restore: line 43: warning: command substitution: ignored null byte in input
Tue Jan 30 15:21:24 2018 [Z0][VMM][E]: restore: Command "virsh --connect qemu:///system restore /var/lib/one//datastores/0/38/checkpoint --xml /var/lib/one//datastores/0/38/checkpoint.xml" failed: error: Failed to restore domain from /var/lib/one//datastores/0/38/checkpoint
Tue Jan 30 15:21:24 2018 [Z0][VMM][I]: error: Cannot access storage file '/var/lib/one//datastores/0/38/disk.0' (as uid:9869, gid:9869): No such file or directory
Tue Jan 30 15:21:24 2018 [Z0][VMM][E]: Could not restore from /var/lib/one//datastores/0/38/checkpoint
Tue Jan 30 15:21:24 2018 [Z0][VMM][I]: ExitCode: 1
Tue Jan 30 15:21:24 2018 [Z0][VMM][I]: Failed to execute virtualization driver operation: restore.
Tue Jan 30 15:21:24 2018 [Z0][VMM][E]: Error restoring VM: Could not restore from /var/lib/one//datastores/0/38/checkpoint
Tue Jan 30 15:21:24 2018 [Z0][VM][I]: New LCM state is BOOT_MIGRATE_FAILURE
After the live migration, I inspected the destination server: the directory “/var/lib/one/datastores/0/38” does not exist there.
After the non-live migration, I inspected the destination server again. This time the directory “/var/lib/one/datastores/0/38” was created, and it contains files, including disk.1:
-rw-r--r-- 1 oneadmin oneadmin 185040260 janv. 30 15:21 checkpoint
-rw-r--r-- 1 oneadmin oneadmin 2119 janv. 30 15:21 checkpoint.xml
-rw-r--r-- 1 oneadmin oneadmin 862 janv. 30 15:14 deployment.0
-rw-r--r-- 1 oneadmin oneadmin 372736 janv. 30 15:21 disk.1
However, the “disk.0” link to the DRBD device was not created.
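For anyone who wants to repeat this check on the destination host, here is a minimal sketch. The function name `check_vm_disks` is made up for illustration, and the directory in the usage line is simply the path that appears in the logs above; adjust both to your setup.

```shell
# check_vm_disks DIR NAME...
# Hypothetical helper: for each expected disk name, report whether it is
# a symlink (and where it points), a plain file, or missing.
check_vm_disks() {
    dir="$1"
    shift
    for name in "$@"; do
        path="$dir/$name"
        if [ -L "$path" ]; then
            # -e follows the link, so it is false for a dangling symlink
            if [ -e "$path" ]; then
                echo "$name: symlink -> $(readlink "$path")"
            else
                echo "$name: dangling symlink -> $(readlink "$path")"
            fi
        elif [ -e "$path" ]; then
            echo "$name: exists, not a symlink"
        else
            echo "$name: missing"
        fi
    done
}
```

Usage on the destination host would look like `check_vm_disks /var/lib/one/datastores/0/38 disk.0 disk.1`. In my case I would expect it to report disk.0 as missing and disk.1 as a plain file.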
Can you provide any insight?
Thank you in advance.