Hi all, I’m trying to set up a cluster with 2 nodes, each with 32 GB of RAM and a dedicated 2 TB HDD. Node 1 is also running the frontend. The purpose of the cluster is to deploy a Hadoop cluster for a degree final project.
My intention is to use the 2 TB disk on each node to run the VMs. The only way I’ve found in the documentation to use this local storage is to set up a Filesystem/SSH datastore.
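For reference, this is roughly how the local system datastore was created (the attribute values match the "onedatastore show 100" output below; the exact commands are my reconstruction from memory):

# Template for the local system datastore; values as in the show output below.
cat > local_system.ds <<'EOF'
NAME        = local_system
TYPE        = SYSTEM_DS
TM_MAD      = ssh
BASE_PATH   = "/onestore/"
BRIDGE_LIST = "kcldsrv1.cutresoft.bog kcldsrv2.cutresoft.bog"
SHARED      = "NO"
EOF
onedatastore create local_system.ds          # it got ID 100
onecluster adddatastore KataClust local_system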
Cluster Configuration:
oneadmin@kcldsrv1:~$ onecluster show 100
CLUSTER 100 INFORMATION
ID : 100
NAME : KataClust
CLUSTER TEMPLATE
DATASTORE_LOCATION="/onestore/"
HOSTS
0
1
DATASTORES
1
2
100
101
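(For completeness, DATASTORE_LOCATION was set in the cluster template; roughly like this, reconstructed from memory:)

onecluster update 100
# in the editor that opens, add:
DATASTORE_LOCATION="/onestore/"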
oneadmin@kcldsrv1:/var/log/one$ onedatastore list
ID NAME SIZE AVAIL CLUSTER IMAGES TYPE DS TM STAT
0 system 196.7G 94% - 0 sys - shared off
1 default 196.7G 94% KataClust 2 img fs shared on
2 files 196.7G 94% KataClust 0 fil fs ssh on
100 local_system - - KataClust 0 sys - ssh on
101 local_image 393.6G 95% KataClust 1 img fs ssh on
oneadmin@kcldsrv1:/var/log/one$ onedatastore show 100
DATASTORE 100 INFORMATION
ID : 100
NAME : local_system
USER : oneadmin
GROUP : oneadmin
CLUSTER : KataClust
TYPE : SYSTEM
DS_MAD : -
TM_MAD : ssh
BASE PATH : /onestore/100
DISK_TYPE : FILE
STATE : READY
DATASTORE CAPACITY
TOTAL: : -
FREE: : -
USED: : -
LIMIT: : -
PERMISSIONS
OWNER : um-
GROUP : u--
OTHER : ---
DATASTORE TEMPLATE
BASE_PATH="/onestore/"
BRIDGE_LIST="kcldsrv1.cutresoft.bog kcldsrv2.cutresoft.bog"
SHARED="NO"
TM_MAD="ssh"
TYPE="SYSTEM_DS"
IMAGES
Datastore 1 is for images (shared). Base path: /var/lib/one/datastores
Datastore 101 is for mounting big data file systems for Hadoop HDFS (ssh). Base path: /onestore
Datastore 100 is for running the VMs, stored locally on the hosts under /onestore (ssh). Base path: /onestore
OK, I have imported a marketplace template, so the resulting OS images are on DS 1. With that configuration, the VMs cannot be copied from the image datastore 1 (default) to the system datastore 100, as you can see in the following excerpt from the log file:
It starts by trying to copy from the right source to the right destination through the right TM driver (shared, since DS 1 is a shared datastore):
Wed Dec 30 18:22:35 2015 [Z0][TM][I]: Command execution fail: /var/lib/one/remotes/tm/shared/clone kcldsrv1.cutresoft.bog:/var/lib/one//datastores/1/2fcffd24b5cc9fc7d3dda2d360d1d28c kcldsrv1.cutresoft.bog:/onestore/100/7/disk.0 7 1
Then the script rewrites the source path using the destination datastore’s base path and tries to copy a non-existent file:
Wed Dec 30 18:22:35 2015 [Z0][TM][I]: clone: Cloning /onestore/1/2fcffd24b5cc9fc7d3dda2d360d1d28c in kcldsrv1.cutresoft.bog:/onestore/100/7/disk.0
Wed Dec 30 18:22:35 2015 [Z0][TM][E]: clone: Command "cd /onestore/100/7; cp /onestore/1/2fcffd24b5cc9fc7d3dda2d360d1d28c /onestore/100/7/disk.0 " failed: Warning: Permanently added 'kcldsrv1.cutresoft.bog,192.168.1.135' (ECDSA) to the list of known hosts.
Wed Dec 30 18:22:35 2015 [Z0][TM][I]: cp: cannot stat '/onestore/1/2fcffd24b5cc9fc7d3dda2d360d1d28c': No such file or directory
I have modified /var/lib/one/remotes/tm/shared/clone to fix the wrong source path.
From:
SRC_PATH="${DST_DS_PATH}${SRC_ARG_PATH##$SRC_DS_PATH}"
To:
SRC_PATH=$SRC_ARG_PATH
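To illustrate what goes wrong, here is the original expression evaluated with the paths from the log above (the values are taken from the log; the variable names are the ones the script already uses):

# SRC_ARG_PATH is the source image path passed to the script; SRC_DS_PATH
# and DST_DS_PATH are the base paths of the source (image) and destination
# (system) datastores.
SRC_ARG_PATH="/var/lib/one/datastores/1/2fcffd24b5cc9fc7d3dda2d360d1d28c"
SRC_DS_PATH="/var/lib/one/datastores"
DST_DS_PATH="/onestore"

# ${SRC_ARG_PATH##$SRC_DS_PATH} strips the image datastore prefix, leaving
# "/1/2fcffd24b5cc9fc7d3dda2d360d1d28c"; prepending DST_DS_PATH then builds
# a path that would only exist if DS 1 were also located under /onestore:
SRC_PATH="${DST_DS_PATH}${SRC_ARG_PATH##$SRC_DS_PATH}"
echo "$SRC_PATH"   # -> /onestore/1/2fcffd24b5cc9fc7d3dda2d360d1d28c (does not exist)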
IMO the shared driver is not taking into account the type and base path of the destination datastore. With that fix, my VMs are being created properly.
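For discussion, this is the kind of fallback I had in mind instead of my hard-coded change (an untested sketch; note the driver runs on the frontend, so the existence check is only meaningful when the frontend sees the same mounts as the hosts):

# Untested idea: prefer the remapped path, but fall back to the path we
# were actually given when the remapped file does not exist.
SRC_PATH="${DST_DS_PATH}${SRC_ARG_PATH##$SRC_DS_PATH}"
if [ ! -e "$SRC_PATH" ]; then
    SRC_PATH="$SRC_ARG_PATH"
fi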
I’m pretty new to OpenNebula, so I suppose I must be missing something, but the thing is that this change solves the problem. Could this be a bug in the tm/shared code?
Regards.