Hello,
I’m a newbie on OpenNebula.
I’m looking for a solution to a problem I consider rather common and trivial.
What kind of datastore (and what kind of disk) should I use to be able to use more than one datastore?
I apologize if my question is trivial, but I tried the following things without success:
second datastore of type “image”, with the boot disk on it: failed with the message “Error deploying virtual machine: Could not create domain from …”
second datastore of type “file”: what kind of file should I use (Kernel, RamDisk, Context), and how do I specify a size for a blank disk?
Since I need this for just one specific VM, I’m ready to simply attach the disk seen by OpenNebula (an NFS volume) to that VM.
Thanks for any hint.
Henri
Hi,
thanks for your answer.
I may be wrong in using the word “datastore”. I just want to be able to use a huge separate disk for storing the data a specific VM uses.
If I define a second ‘image’ datastore that uses a second NFS-mounted disk, while keeping the ‘system’ datastore on the first NFS-mounted disk, I get the following error:
“Error executing image transfer script: Error copying NEBULA_HOST_NAME:/var/lib/one/NEW_DATASTORE/ to NFS_SERVER_NAME:/var/lib/one//datastores/0/321/disk.0”
If you set the image as “Persistent” it will be symlinked instead of copied.
Hope this helps.
A permanent solution is to fork the TM_MAD as a separate addon, with fixed drivers that clone into the datastore and create symlinks in the system datastore for non-persistent images too. I could create a PoC of such an addon if I find some free time over the weekend.
Please find attached a patch that keeps the shared files in the shared image datastore: shared_self.patch (7.0 KB)
To try it you should define and enable a new TM_MAD driver in the configuration, as follows:
First clone tm/shared as tm/shared_self and patch the “new” driver:
# copy tm/shared as tm/shared_self
su - oneadmin -c 'cp -a remotes/tm/shared{,_self}'
# apply the patch (assuming the patch is in /tmp/shared_self.patch)
cd ~oneadmin/remotes/tm
patch -p0 < /tmp/shared_self.patch
Then you need to tell OpenNebula about the new TM_MAD by editing /etc/one/oned.conf.
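For illustration, the oned.conf additions might look like the sketch below; the driver list in ARGUMENTS and the TM_MAD_CONF values are assumptions (mirroring the stock shared driver’s entry), so adapt them to your installation:

```
# /etc/one/oned.conf -- sketch; the driver list shown is an example only
TM_MAD = [
    EXECUTABLE = "one_tm",
    ARGUMENTS  = "-t 15 -d dummy,shared,shared_self,ssh,qcow2"
]

# Assumed to mirror the stock "shared" driver's TM_MAD_CONF entry
TM_MAD_CONF = [
    NAME         = "shared_self",
    LN_TARGET    = "NONE",
    CLONE_TARGET = "SYSTEM",
    SHARED       = "YES"
]
```

After the change, restart the OpenNebula daemon so it picks up the new driver.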
Hi Anton,
thanks a lot for your help.
I probably missed something in the configuration I defined.
I followed step-by-step your patching process, without error.
I created a 1TB disk on the second image datastore (the one with TM_MAD=shared_self).
I tried to attach this disk to the VM template I need to instantiate (boot disk on the standard datastore).
It failed with this message:
“Error deploying virtual machine: Could not create domain from /var/lib/one//datastores/0/324/deployment.0”
Please find below the shared datastore definition (screen copy), and the template of the VM.
Thanks
Best regards
Henri Delebecque
Please check on the shared filesystem that you have an image created with a name like an md5sum hash. If the image is not flagged as persistent, there should be a second file named $VM_ID-$DISK_ID, and on the hypervisor you should have a symlink in the VM’s home, usually /var/lib/one/datastores/$SYSTEM_DS_ID/$VM_ID/
Also, the output of onevm show $VM_ID -x would be helpful to pinpoint where the issue is.
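To make those checks easier, here is a small POSIX-shell sketch that walks a VM’s directory and flags disk symlinks whose target is missing (the directory path is whatever your system datastore uses; nothing here is specific to the patched driver):

```shell
#!/bin/sh
# Sketch: inspect disk.* entries in a VM directory and report whether each
# symlink's target actually exists (a missing target often means the backing
# datastore volume is not mounted on this host).
check_vm_disks() {
    vm_dir="$1"
    for disk in "$vm_dir"/disk.*; do
        [ -L "$disk" ] || continue          # only inspect symlinks
        target=$(readlink "$disk")
        if [ -e "$disk" ]; then             # -e follows the link
            printf 'OK     %s -> %s\n' "$disk" "$target"
        else
            printf 'BROKEN %s -> %s\n' "$disk" "$target"
        fi
    done
}
```

Run it as e.g. `check_vm_disks /var/lib/one/datastores/0/324` on the hypervisor.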
Hi,
the shared file system volume contains only the 1TB file I created (d0b112686ba0bab8fc16f48c19faa98f).
On the OpenNebula node (I have a specific VM for it), the /var/lib/one/datastores/0/$VM_ID/ directory contains:
-rw-r--r-- 1 oneadmin oneadmin 1068 Jun 11 13:35 deployment.0
lrwxrwxrwx 1 oneadmin oneadmin 31 Jun 11 13:35 disk.0 -> /var/lib/one/datastores/1/324-0
lrwxrwxrwx 1 oneadmin oneadmin 60 Jun 11 13:35 disk.1 -> /var/lib/one/datastores/116/d0b112686ba0bab8fc16f48c19faa98f
-rw-r--r-- 1 oneadmin oneadmin 372736 Jun 11 13:35 disk.2
Datastores 0, 1 and 2 are located on the first NFS-mounted volume, while datastore 116 is on the other NFS-mounted volume.
I attach the result of the onevm show command.
It looks like disk.0 is created with shared_self, because its name is $VM_ID-$DISK_ID, but in the VM’s XML you posted the TM_MAD is still shared for disk 0, which is weird… Could you try changing all shared datastores to use shared_self?
Also take a look at the VM logs - there should be some more lines before/after Error deploying virtual machine … with information what exactly libvirt doesn’t like…
Then restart opennebula.service and try instantiating a new VM. Finally, you should update the Datastore template too: editing TM_MAD_SYSTEM from ssh to ssh,shared should be enough for OpenNebula to refresh/populate the other fields…
Hi,
even after modifying /etc/one/oned.conf as described, the deploy failed with these messages in the VM’s log, where 9869 is the uid/gid of oneadmin.
One strange thing: this line in the log:
“Cannot access storage file ‘/var/lib/one//datastores/0/326/disk.1’ (as uid:9869, gid:9869)” is followed by “Aucun fichier ou dossier de ce type” (“no such file or directory”, in French). But the file exists as a symlink:
ls -al /var/lib/one/datastores/0/327/disk.1
lrwxrwxrwx 1 oneadmin oneadmin 60 Jun 11 16:00 /var/lib/one/datastores/0/327/disk.1 -> /var/lib/one/datastores/116/d450d37defb8b4e350499dd2d13cb5b5
and the symlink target exists too:
ls -alh /var/lib/one/datastores/116
total 8.0K
drwxr-xr-x 2 oneadmin oneadmin 4.0K Jun 11 15:59 .
drwxr-xr-x 7 oneadmin oneadmin 4.0K Jun 7 17:46 ..
-rw-r--r-- 1 oneadmin oneadmin 10G Jun 11 15:58 bc21230b51137d38a068834174c3ec42
-rw-r--r-- 1 oneadmin oneadmin 10G Jun 11 15:59 d450d37defb8b4e350499dd2d13cb5b5
The main difference between this case and a “good” one is that datastore 116 is located on a different volume from the one holding disk.0.
Thanks
Best regards,
H. Delebecque
Tue Jun 11 15:22:13 2019 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/326/deployment.0
Tue Jun 11 15:22:13 2019 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
Tue Jun 11 15:22:13 2019 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Tue Jun 11 15:22:13 2019 [Z0][VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/kvm/deploy '/var/lib/one//datastores/0/326/deployment.0' 'HOST_NAME' 326 HOST_NAME
Tue Jun 11 15:22:13 2019 [Z0][VMM][I]: error: Failed to create domain from /var/lib/one//datastores/0/326/deployment.0
Tue Jun 11 15:22:13 2019 [Z0][VMM][I]: error: Cannot access storage file '/var/lib/one//datastores/0/326/disk.1' (as uid:9869, gid:9869): Aucun fichier ou dossier de ce type
Tue Jun 11 15:22:13 2019 [Z0][VMM][E]: Could not create domain from /var/lib/one//datastores/0/326/deployment.0
Tue Jun 11 15:22:13 2019 [Z0][VMM][I]: ExitCode: 255
Tue Jun 11 15:22:13 2019 [Z0][VMM][I]: Failed to execute virtualization driver operation: deploy.
Tue Jun 11 15:22:13 2019 [Z0][VMM][E]: Error deploying virtual machine: Could not create domain from /var/lib/one//datastores/0/326/deployment.0
Hi,
thanks for the time you spent helping me.
My architecture is not homogeneous:
the hosts managed by OpenNebula are KVM-based, and hosted by a small dedicated cluster.
the NFS server that owns the two datastore volumes is hosted by one of these nodes.
the OpenNebula server is an ESX host, located in a production cluster.
The NFS server is installed directly on a physical node.
The owner of the libvirtd process is root, but the physical node’s OS does not define any oneadmin user.
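Given that libvirtd runs as root on the hypervisors and the NFS server has no oneadmin account, root squashing on the export is a plausible cause of the uid:9869 access error. A hypothetical /etc/exports entry illustrating the two usual ways to deal with it (the path and client network are assumptions, not taken from the thread):

```
# /etc/exports -- sketch; replace the path and client network with yours.
# Either let root on the clients act as root ...
/var/lib/one/datastores  192.168.0.0/24(rw,sync,no_subtree_check,no_root_squash)
# ... or map squashed accesses to oneadmin's uid/gid instead:
# /var/lib/one/datastores  192.168.0.0/24(rw,sync,no_subtree_check,all_squash,anonuid=9869,anongid=9869)
```

Run `exportfs -ra` on the NFS server after editing the file.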
Best regards
H. Delebecque
Hi,
I tried adding the namespace directive on the physical nodes:
the one that hosts the NFS server,
the one that hosts the OpenNebula VM that accesses the two disks (boot and data disk),
without success.
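Assuming the “namespace directive” refers to libvirt’s mount-namespace setting in qemu.conf (my interpretation; the thread does not spell it out), the change on each KVM node would look like:

```
# /etc/libvirt/qemu.conf -- sketch: an empty list disables the private
# mount namespace that QEMU processes normally get, so the guest process
# sees the host's NFS mounts directly; restart libvirtd afterwards.
namespaces = []
```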
I haven’t found anything related to my problem in the system logs of the nodes concerned (the NFS server, the OpenNebula host, or the host running the faulty VM).
Best regards
H. Delebecque
Hi,
I solved the problem by defining a new system datastore located on the same volume as the datastore containing the data disks.
Thanks a lot
Best regards
H. Delebecque