Odd datastore usage: "[TemplateInstantiate] Failed to clone images: Not enough space in datastore" on shared system datastore for (remote) ssh datastore

I have (NFS-backed) datastores 0 to 2 set up as shared, mounted on each hypervisor at /var/lib/one/datastores/?, as well as datastores 103 (image) and 102 (system) of type ssh. I want most VMs to be easily relocatable (hence the shared storage), but some VMs need all the IOPS they can get, so they should stay on the local disks of the HV.
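
For reference, this is roughly how I double-check which transfer driver each datastore uses (just a sketch with the IDs from my setup; the exact attribute values depend on how the datastores were created):

# TM_MAD should report "ssh" for 102/103 and "shared" (or "qcow2") for 0-2
onedatastore list
onedatastore show 102 | grep TM_MAD
onedatastore show 0 | grep TM_MAD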

My template for deploying VMs to HV-local storage has:

SCHED_DS_RANK = "FREE_MB"
SCHED_DS_REQUIREMENTS = "ID=\"102\""
SCHED_RANK = "FREE_CPU"
SCHED_REQUIREMENTS = "ID=\"2\" | ID=\"3\" | ID=\"9\" | CLUSTER_ID=\"100\""
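
The template is then instantiated more or less like this (a sketch; the template and VM names are placeholders, and I use the persistent option so the image gets cloned for the new VM):

# placeholder names; --persistent clones the template's image(s) into
# new persistent images owned by the VM
onetemplate instantiate "hv-local-vm" --name db01 --persistent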

When I try to deploy a new persistent VM (3.3 GB actual size, 20 GB virtual, qcow2) from this template, I now get [TemplateInstantiate] Failed to clone images: Not enough space in datastore

… which is most likely because the NFS share is nearly full:

root@one-1:~# df -h /var/lib/one/datastores/
Filesystem           Size  Used Avail Use% Mounted on
nfs-int:/nfs         204G  177G   17G  92% /nfs

On the Frontend, 103 is on NFS as well:

root@one-1:~# df -h /var/lib/one/datastores/*
Filesystem           Size  Used Avail Use% Mounted on
nfs-int:/nfs         204G  177G   17G  92% /nfs
nfs-int:/nfs         204G  177G   17G  92% /nfs
nfs-int:/nfs         204G  177G   17G  92% /nfs
nfs-int:/nfs         204G  177G   17G  92% /nfs
nfs-int:/nfs         204G  177G   17G  92% /nfs

On the HVs, the directories are linked (and 103 stays empty?):

root@hv-03:~# ls -la /var/lib/one/datastores/
total 8
drwxr-xr-x 2 root     root 4096 Feb 19 02:41 .
drwxr-xr-x 6 oneadmin root 4096 Feb 19 01:01 ..
lrwxrwxrwx 1 root     root   17 Feb 19 01:01 0 -> /nfs/datastores/0
lrwxrwxrwx 1 root     root   17 Feb 19 01:01 1 -> /nfs/datastores/1
lrwxrwxrwx 1 root     root   23 Feb 19 01:28 102 -> /var/lib/libvirt/images
lrwxrwxrwx 1 root     root   34 Feb 19 02:41 103 -> /var/lib/libvirt/images/one-images
lrwxrwxrwx 1 root     root   17 Feb 19 01:01 2 -> /nfs/datastores/2
lrwxrwxrwx 1 root     root   25 Feb 19 01:01 .isofiles -> /nfs/datastores/.isofiles
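
In case it matters, the paths OpenNebula itself expects for these datastores can be compared against the symlink targets above (a sketch, assuming a stock install):

# the "BASE PATH" line in the output should match the symlink targets
onedatastore show 102
onedatastore show 103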

Did I foobar the setup somehow? I mean, why would cloning a VM from a 3.3 GB qcow2 image to a persistent image on a remote datastore need >17 GB of space on the local datastore? I would expect the 3.3 GB qcow2 file to be ssh’d to the destination and then some qemu-img magic to explode it to its full size on the remote datastore, no? Less wear and tear on the frontend’s storage and the overall network? Or did I just miss some concept when setting up my OpenNebula cloud? :wink:

The accounted size is the virtual size of the qcow2 disk, not the size of the file, because the image can grow up to that size. Can you check if that’s what is happening in your case?
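
You can compare the two numbers directly, for example (the image path and ID are placeholders):

# "virtual size" is what OpenNebula accounts for (20 GB in your case),
# "disk size" is what the qcow2 file actually occupies (~3.3 GB)
qemu-img info /var/lib/one/datastores/103/<image-file>

# the SIZE shown for the image reflects the virtual size as well
oneimage show <image-id> | grep SIZE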

Yes, that seems to be the issue: 20 GB > 17 GB, even though only about 3 GB is actually in use right now. Thanks for the clarification.
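
For anyone who hits the same message: the free space that the clone is checked against can be seen in the capacity section of the image datastore (ID from my setup):

# the DATASTORE CAPACITY section lists TOTAL/FREE/USED as OpenNebula sees them;
# the clone fails when the image's virtual size exceeds FREE
onedatastore show 103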