I’m having trouble getting Ceph datastores to work properly on my installation, which runs 5.8.0-1 on CentOS 7 with a shared file system and three new Ceph datastores. Specifically, migrations between hosts fail when TM_MAD is set to “ceph”.
Here are my configurations:
Ceph SSD Datastore (101):
DATASTORE TEMPLATE
ALLOW_ORPHANS="mixed"
BRIDGE_LIST="hypervisor01 hypervisor02"
CEPH_HOST="ceph01 ceph02 ceph03 ceph04 ceph05 ceph06"
CEPH_SECRET="SECRET_KEY"
CEPH_USER="SECRET_USER"
DISK_TYPE="RBD"
DS_MIGRATE="NO"
POOL_NAME="one_ssd"
RESTRICTED_DIRS="/"
SAFE_DIRS="/var/tmp"
SHARED="YES"
TM_MAD="ceph"
TM_MAD_SYSTEM="shared"
TYPE="SYSTEM_DS"
Ceph HDD Datastore (102):
DATASTORE TEMPLATE
ALLOW_ORPHANS="mixed"
BRIDGE_LIST="hypervisor01 hypervisor02"
CEPH_HOST="ceph01 ceph02 ceph03 ceph04 ceph05 ceph06"
CEPH_SECRET="SECRET_KEY"
CEPH_USER="SECRET_USER"
DISK_TYPE="RBD"
DS_MIGRATE="NO"
POOL_NAME="one_hdd"
RESTRICTED_DIRS="/"
SAFE_DIRS="/var/tmp"
SHARED="YES"
TM_MAD="ceph"
TM_MAD_SYSTEM="shared"
TYPE="SYSTEM_DS"
Ceph Image Datastore (103):
DATASTORE TEMPLATE
ALLOW_ORPHANS="mixed"
BRIDGE_LIST="hypervisor01 hypervisor02"
CEPH_HOST="ceph01 ceph02 ceph03 ceph04 ceph05 ceph06"
CEPH_SECRET="SECRET_KEY"
CEPH_USER="SECRET_USER"
CLONE_TARGET="SYSTEM"
CLONE_TARGET_SHARED="SYSTEM"
CLONE_TARGET_SSH="SYSTEM"
DISK_TYPE="RBD"
DISK_TYPE_SHARED="RBD"
DISK_TYPE_SSH="FILE"
DRIVER="raw"
DS_MAD="ceph"
LN_TARGET="SYSTEM"
LN_TARGET_SHARED="SYSTEM"
LN_TARGET_SSH="SYSTEM"
POOL_NAME="one_images"
RESTRICTED_DIRS="/"
SAFE_DIRS="/var/tmp"
TM_MAD="ceph"
TM_MAD_SYSTEM="shared"
TYPE="IMAGE_DS"
ONED LOG:
Tue Apr 9 14:55:12 2019 [Z0][VMM][I]: premigrate: Moving hypervisor01:/var/lib/one//datastores/102/87 to hypervisor02:/var/lib/one//datastores/102/87
Tue Apr 9 14:55:12 2019 [Z0][VMM][E]: premigrate: Command "set -e -o pipefail
Tue Apr 9 14:55:12 2019 [Z0][VMM][I]: tar -C /var/lib/one//datastores/102 --sparse -cf - 87 | ssh hypervisor02 'tar -C /var/lib/one//datastores/102 --sparse -xf -'" failed: tar: 87: Cannot stat: No such file or directory
Tue Apr 9 14:55:12 2019 [Z0][VMM][I]: tar: Exiting with failure status due to previous errors
Tue Apr 9 14:55:12 2019 [Z0][VMM][E]: Error copying disk directory to target host
Tue Apr 9 14:55:12 2019 [Z0][VMM][I]: Failed to execute transfer manager driver operation: tm_premigrate.
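The tar error suggests that the directory 87 does not exist under /var/lib/one//datastores/102 on the source host. A minimal sketch for confirming that on each hypervisor follows; check_vm_dir is a hypothetical helper name used only for illustration, not part of OpenNebula, and the path and VM ID are taken from the log above.

```shell
# Hypothetical helper: report whether a VM directory exists under a datastore path.
# Usage: check_vm_dir <datastore_dir> <vm_id>
check_vm_dir() {
    ds_dir="$1"
    vm_id="$2"
    if [ -d "$ds_dir/$vm_id" ]; then
        echo "present: $ds_dir/$vm_id"
    else
        echo "missing: $ds_dir/$vm_id"
        return 1
    fi
}

# Run on each hypervisor (hypervisor01 and hypervisor02), for example:
#   check_vm_dir /var/lib/one/datastores/102 87
```

If the directory is reported missing on the source host, that would match the "Cannot stat" message from tar.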
When I set TM_MAD to “shared” on the HDD (102) and SSD (101) datastores, migration appears to work. What am I doing wrong? Is this how it should be set up?
The other issue is that even with CLONE_TARGET_* and LN_TARGET_* set to “SYSTEM”, images don’t appear to be copied to the system datastores:
$ rbd ls -p one_images --id libvirt
one-11
one-11-87-0
$ rbd ls -p one_ssd --id libvirt
<empty>
$ rbd ls -p one_hdd --id libvirt
<empty>
I would appreciate any help. I’ve pored over the documentation and must be missing something. It’s worth noting that we also have a shared file system at /var/lib/one/datastores/.