I have spent hours searching for many different variations of the errors I am seeing in /var/log/one/oned.log. I have found plenty of solutions to similar problems, but none seem to address my issue specifically.
Here is my setup:
- Control Node: Sunstone, MySQL, etc.
- Compute Node: libvirt, opennebula-node-kvm
- Storage: two-node GlusterFS cluster
(I moved from an OpenStack installation to OpenNebula, hence “compute” and “control”.)
Getting GlusterFS to work was an uphill battle. I ended up following a series of docs and forum posts to get it working “properly” (I am not yet entirely sure that it is). In case anyone asks for my storage configuration, here it is:
ID   NAME    SIZE  AVAIL  CLUSTERS  IMAGES  TYPE  DS  TM      STAT
110  system  5.5T  100%   0         0       sys   -   shared  on
111  images  5.5T  100%   0         2       img   fs  shared  on
112  files   5.5T  100%   0         0       fil   fs  shared  on
DATASTORE 110 INFORMATION
ID : 110
NAME : system
USER : oneadmin
GROUP : oneadmin
CLUSTERS : 0
TYPE : SYSTEM
DS_MAD : -
TM_MAD : shared
BASE PATH : /var/lib/one//datastores/110
DISK_TYPE : FILE
STATE : READY
PERMISSIONS
OWNER : um-
GROUP : u--
OTHER : ---
DATASTORE TEMPLATE
DISK_TYPE="FILE"
DS_MIGRATE="YES"
RESTRICTED_DIRS="/"
SAFE_DIRS="/var/tmp"
SHARED="YES"
TM_MAD="shared"
TYPE="SYSTEM_DS"
DATASTORE 111 INFORMATION
ID : 111
NAME : images
USER : oneadmin
GROUP : oneadmin
CLUSTERS : 0
TYPE : IMAGE
DS_MAD : fs
TM_MAD : shared
BASE PATH : /var/lib/one//datastores/111
DISK_TYPE :
STATE : READY
PERMISSIONS
OWNER : um-
GROUP : u--
OTHER : ---
DATASTORE TEMPLATE
CLONE_TARGET="SYSTEM"
DISK_TYPE="GLUSTER"
DS_MAD="fs"
GLUSTER_HOST="gluster1:24007"
GLUSTER_VOLUME="volumename"
LN_TARGET="NONE"
RESTRICTED_DIRS="/"
SAFE_DIRS="/var/tmp"
TM_MAD="shared"
TYPE="IMAGE_DS"
DATASTORE 112 INFORMATION
ID : 112
NAME : files
USER : oneadmin
GROUP : oneadmin
CLUSTERS : 0
TYPE : FILE
DS_MAD : fs
TM_MAD : shared
BASE PATH : /var/lib/one//datastores/112
DISK_TYPE : FILE
STATE : READY
PERMISSIONS
OWNER : um-
GROUP : u--
OTHER : ---
DATASTORE TEMPLATE
CLONE_TARGET="SYSTEM"
DS_MAD="fs"
GLUSTER_HOST="gluster1:24007"
GLUSTER_VOLUME="volumename"
LN_TARGET="NONE"
RESTRICTED_DIRS="/"
SAFE_DIRS="/var/tmp"
TM_MAD="shared"
TYPE="FILE_DS"
Now that that is out of the way, let’s take a look at my logs. This is an excerpt from creating a VM with a CentOS x86_64 template from the OpenNebula “apps” area. I also tested with “ttylinux” and with a qcow2c image from the CentOS repos.
Thu Jan 26 19:43:19 2017 [Z0][VMM][D]: Message received: LOG I 16 Successfully execute transfer manager driver operation: tm_context.
Thu Jan 26 19:43:19 2017 [Z0][VMM][D]: Message received: LOG I 16 Successfully execute network driver operation: pre.
Thu Jan 26 19:43:20 2017 [Z0][VMM][D]: Message received: LOG I 16 Command execution fail: cat << EOT | /var/tmp/one/vmm/kvm/deploy '/var/lib/one//datastores/110/16/deployment.0' 'compute1' 16 compute1
Thu Jan 26 19:43:20 2017 [Z0][VMM][D]: Message received: LOG I 16 error: Failed to create domain from /var/lib/one//datastores/110/16/deployment.0
Thu Jan 26 19:43:20 2017 [Z0][VMM][D]: Message received: LOG I 16 error: Cannot access storage file '/622fda14e7c896865adf8175b3ab1b01' (as uid:9869, gid:9869): No such file or directory
Thu Jan 26 19:43:20 2017 [Z0][VMM][D]: Message received: LOG E 16 Could not create domain from /var/lib/one//datastores/110/16/deployment.0
Thu Jan 26 19:43:20 2017 [Z0][VMM][D]: Message received: LOG I 16 ExitCode: 255
Thu Jan 26 19:43:20 2017 [Z0][VMM][D]: Message received: LOG I 16 Failed to execute virtualization driver operation: deploy.
Thu Jan 26 19:43:20 2017 [Z0][VMM][D]: Message received: DEPLOY FAILURE 16 Could not create domain from /var/lib/one//datastores/110/16/deployment.0
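To make the odd part of that error easier to spot, here is a small Python sketch that pulls the storage path out of such a log line and checks whether it sits under the datastore base path. This is not part of OpenNebula: the function name and the regular expression are made up for illustration, and the base path is simply copied from the datastore listings above.

```python
import posixpath
import re

# Base path copied from the datastore listings above (assumption: the same
# base path applies on the compute node).
DATASTORE_ROOT = "/var/lib/one//datastores"

# Hypothetical pattern for the libvirt message as it appears in oned.log.
STORAGE_ERROR = re.compile(
    r"Cannot access storage file '(?P<path>[^']*)' "
    r"\(as uid:(?P<uid>\d+), gid:(?P<gid>\d+)\)"
)

# The line from the excerpt above, used as sample input.
SAMPLE = (
    "Thu Jan 26 19:43:20 2017 [Z0][VMM][D]: Message received: LOG I 16 "
    "error: Cannot access storage file '/622fda14e7c896865adf8175b3ab1b01' "
    "(as uid:9869, gid:9869): No such file or directory"
)


def find_storage_errors(lines, root=DATASTORE_ROOT):
    """Yield (path, uid, gid, under_root) for each storage-access error."""
    # Collapse the double slash so the prefix comparison is reliable.
    root_prefix = posixpath.normpath(root) + "/"
    for line in lines:
        match = STORAGE_ERROR.search(line)
        if match is None:
            continue
        path = match.group("path")
        under_root = posixpath.normpath(path).startswith(root_prefix)
        yield path, int(match.group("uid")), int(match.group("gid")), under_root


if __name__ == "__main__":
    for path, uid, gid, under_root in find_storage_errors([SAMPLE]):
        print(f"path={path} uid={uid} gid={gid} under_datastore_root={under_root}")
```

For the line in my log, the extracted path is /622fda14e7c896865adf8175b3ab1b01, which is not under /var/lib/one//datastores at all. So libvirt appears to be looking for the image at the filesystem root, which may be a hint about how the disk source is being generated, though I cannot say for sure.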
I believe the problem lies in how the images are handled (not a bad image, but an issue with one of the servers or protocols serving it). I say this because I can create a “blank” VM (no image, just an attached disk; VNC shows “no bootable disk found”) and it deploys fine. I take this as enough evidence that the compute (KVM) node is set up correctly. But of course, I could be wrong.
I have verified that oneadmin can reach the datastores and has read and write access to them. I have torn down my GlusterFS installation, removed all bricks and volumes, and recreated it. I have also torn down my compute installation, removed libvirtd and OpenNebula, and reinstalled them.
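For anyone wanting to repeat that kind of access check, a minimal Python sketch follows. It is not the exact procedure I used. It assumes it is run as oneadmin on the node being tested, that the datastores are mounted at the base paths shown in the listings, and the helper name is made up.

```python
import os

# Base paths copied from the datastore listings; assumed to be mounted locally.
DATASTORE_PATHS = [
    "/var/lib/one//datastores/110",
    "/var/lib/one//datastores/111",
    "/var/lib/one//datastores/112",
]


def check_access(paths):
    """Return {path: (is_dir, readable, writable)} for the current user."""
    report = {}
    for path in paths:
        report[path] = (
            os.path.isdir(path),
            # Listing a directory needs read plus search (execute) permission.
            os.access(path, os.R_OK | os.X_OK),
            # Creating files in it needs write plus search permission.
            os.access(path, os.W_OK | os.X_OK),
        )
    return report


if __name__ == "__main__":
    for path, (is_dir, readable, writable) in check_access(DATASTORE_PATHS).items():
        print(f"{path}: is_dir={is_dir} readable={readable} writable={writable}")
```

Note that this only tests the datastore directories as seen by the current user. It says nothing about whether libvirt on the compute node resolves the image path the same way.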
Does anyone have any ideas? Any help would be appreciated.