Scheduler policies

Hello,

I would like to know if it is possible to apply some “scheduler policy” so that the OpenNebula Scheduler chooses the least loaded node and, at the same time, that node’s local datastore. My OpenNebula cluster is composed of one frontend server that also acts as a KVM node, plus one dedicated KVM server. Each server has its own local datastore, cross-shared via NFS (the frontend shares datastores 0, 1 and 2 over NFS and the KVM node shares datastores 100 and 101 over NFS), so both servers “see” all datastores.
But I want that, when the scheduler chooses a server to allocate an instance and consume “x” CPUs, it also chooses that server’s local datastore, so a running instance always takes its CPUs from the same server that holds its deployment files and disks.

Is it possible?

I have read in the Scheduler Configuration — OpenNebula 6.0.3 documentation that I could configure the scheduler with the policy DEFAULT_SCHED = 2 (load-aware), so that a new instance is allocated on the node with the least load. But I also need the instance to use the local datastore of the selected node.
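
For reference, this is the change I understand from the docs in /etc/one/sched.conf (just a sketch; as far as I can see, the separate DEFAULT_DS_SCHED block only offers packing/striping/fixed, with no “follow the chosen host” option):

# /etc/one/sched.conf (sketch)
DEFAULT_SCHED = [
   POLICY = 2       # 2 = load-aware (ranks hosts by FREE_CPU)
]

DEFAULT_DS_SCHED = [
   POLICY = 1       # 0 = packing, 1 = striping, 2 = fixed; nothing ties the DS to the chosen host
]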

I have read in the Virtual Machine Template — OpenNebula 6.0.3 documentation that there are some predefined attributes like “NAME” and “HYPERVISOR”. With these attributes, would it be possible to force a new instance to take its CPUs and its datastore from the same node?
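
For example, I guess I could keep pinning a host and its datastore manually in each template with something like this (the host name and datastore ID are just examples from my setup):

# VM template snippet (manual pinning; “kvm-node” and 100 are examples)
SCHED_REQUIREMENTS    = "NAME = \"kvm-node\" & HYPERVISOR = \"kvm\""
SCHED_DS_REQUIREMENTS = "ID = 100"

But what I am looking for is the scheduler making this host/datastore pairing automatically, instead of hard-coding it per template.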

Thanks.

Hi,

I have sent this question to @ahuertas and @cgonzalez.

Please, if you could help, I would be very grateful.

Thanks a lot!

Hello @Daniel_Ruiz_Molina,

You can use SCHED_DS_REQUIREMENTS, please check this.

Best,
Álex.

Hi @ahuertas,

In my templates I am already using the SCHED_DS_REQUIREMENTS and SCHED_REQUIREMENTS parameters to manually specify a host and its datastore (each host has its own datastore, cross-shared over NFS between server and node). But my question is a different one. The problem I have found is that when I configure the policy “DEFAULT_SCHED=2” (load-aware), it would be great and very useful if the VM were also created on the local datastore of the node it is “taking” its CPUs from.
With “DEFAULT_SCHED=2”, OpenNebula chooses the KVM node with the least load and creates the VM there, so the VM consumes CPUs and RAM from that node. However, the chosen datastore is “0” (system) and not the local datastore of that KVM node.

I attach a screenshot of the “Scheduling” tab in the VM template. I can choose the “Load-aware” policy for the host rank, but there is no policy to choose the (local) datastore owned by the chosen host.

Thanks.

Hi,

Could you help me? Any news in version 6.0.3, 6.2 or 6.4? @ahuertas @cgonzalez

For me, automatic datastore scheduling when I apply “DEFAULT_SCHED=2” (load-aware) would be very important, because my datastores aren’t shared across the whole cluster (no Ceph, Lustre or Gluster): each node (just 2) has its own datastore and then, over NFS, the server shares 0, 1 and 2 and the KVM node shares 100 and 101, to allow migrations if they become necessary.
But if I apply “DEFAULT_SCHED=2” (load-aware) and the scheduler chooses the KVM node, the datastore needs to be 100 or 101; instead, the OpenNebula scheduler chooses 0, 1 or 2, so the VM is created on the server node (consuming its free space) while consuming CPUs from the KVM node…

Thanks.

Hi @Daniel_Ruiz_Molina,

Please correct me if I’m wrong, but what is the point of sharing the local storage of each node across both of them? Why don’t you just use an ssh datastore, which by default will use the local storage of each node?

Note that both cold and live migration are supported by the ssh driver.
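
For example, once both datastores use the ssh driver you migrate VMs with the usual commands (the IDs below are placeholders):

onevm migrate 42 1          # cold migration of VM 42 to host 1
onevm migrate --live 42 1   # live migration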

Hi @cgonzalez,

At first, when I configured the second node (the first version of the cluster was formed only by a powerful server acting as both frontend and KVM node), I noticed that when I instantiated a VM on the second node the instance went into error, because the second node did not have “datastore-0” and/or “datastore-1” mounted. So I thought those datastores needed to be shared with the KVM node, and the new “system” datastore on the new node needed to be shared with the server… Is that wrong?
My four datastores (three on the server and one on the KVM node) are configured in the following way:

  • system (0):
**DATASTORE 0 INFORMATION**
ID             : 0
NAME           : system
USER           : oneadmin
GROUP          : oneadmin
CLUSTERS       : 0
TYPE           : SYSTEM
DS_MAD         : -
TM_MAD         : qcow2
BASE PATH      : /var/lib/one//datastores/0
DISK_TYPE      : FILE
STATE          : READY
**DATASTORE CAPACITY**
TOTAL:         : 7.6T
FREE:          : 7T
USED:          : 656.9G
LIMIT:         : -
**PERMISSIONS**
OWNER          : um-
GROUP          : u--
OTHER          : ---
**DATASTORE TEMPLATE**
ALLOW_ORPHANS="NO"
DISK_TYPE="FILE"
DS_MIGRATE="YES"
PRIORITY="100"
RESTRICTED_DIRS="/"
SAFE_DIRS="/var/tmp"
SHARED="YES"
TM_MAD="qcow2"
TYPE="SYSTEM_DS"
  • default (1):
**DATASTORE 1 INFORMATION**
ID             : 1
NAME           : default
USER           : oneadmin
GROUP          : oneadmin
CLUSTERS       : 0
TYPE           : IMAGE
DS_MAD         : fs
TM_MAD         : qcow2
BASE PATH      : /var/lib/one//datastores/1
DISK_TYPE      : FILE
STATE          : READY
**DATASTORE CAPACITY**
TOTAL:         : 7.6T
FREE:          : 7T
USED:          : 656.9G
LIMIT:         : -
**PERMISSIONS**
OWNER          : um-
GROUP          : u--
OTHER          : ---
**DATASTORE TEMPLATE**
ALLOW_ORPHANS="NO"
CLONE_TARGET="SYSTEM"
CLONE_TARGET_SSH="SYSTEM"
DISK_TYPE="FILE"
DISK_TYPE_SSH="FILE"
DRIVER="qcow2"
DS_MAD="fs"
LN_TARGET="NONE"
LN_TARGET_SSH="SYSTEM"
RESTRICTED_DIRS="/"
SAFE_DIRS="/var/tmp"
TM_MAD="qcow2"
TM_MAD_SYSTEM="ssh"
TYPE="IMAGE_DS"
  • files (2):
**DATASTORE 2 INFORMATION**
ID             : 2
NAME           : files
USER           : oneadmin
GROUP          : oneadmin
CLUSTERS       : 0
TYPE           : FILE
DS_MAD         : fs
TM_MAD         : qcow2
BASE PATH      : /var/lib/one//datastores/2
DISK_TYPE      : FILE
STATE          : READY
**DATASTORE CAPACITY**
TOTAL:         : 7.6T
FREE:          : 7T
USED:          : 656.9G
LIMIT:         : -
**PERMISSIONS**
OWNER          : um-
GROUP          : u--
OTHER          : ---
**DATASTORE TEMPLATE**
ALLOW_ORPHANS="NO"
CLONE_TARGET="SYSTEM"
DRIVER="qcow2"
DS_MAD="fs"
LN_TARGET="NONE"
RESTRICTED_DIRS="/"
SAFE_DIRS="/var/tmp"
TM_MAD="qcow2"
TYPE="FILE_DS"
  • system2 (103) (in KVM node):
**DATASTORE 103 INFORMATION**
ID             : 103
NAME           : system2
USER           : oneadmin
GROUP          : oneadmin
CLUSTERS       : 0
TYPE           : SYSTEM
DS_MAD         : -
TM_MAD         : qcow2
BASE PATH      : /var/lib/one//datastores/103
DISK_TYPE      : FILE
STATE          : READY
**DATASTORE CAPACITY**
TOTAL:         : 9.1T
FREE:          : 7.9T
USED:          : 1.2T
LIMIT:         : -
**PERMISSIONS**
OWNER          : um-
GROUP          : u--
OTHER          : ---
**DATASTORE TEMPLATE**
ALLOW_ORPHANS="NO"
BRIDGE_LIST="192.168.14.102"
DISK_TYPE="FILE"
DS_MIGRATE="YES"
PRIORITY="50"
RESTRICTED_DIRS="/"
SAFE_DIRS="/var/tmp"
SHARED="YES"
TM_MAD="qcow2"
TYPE="SYSTEM_DS"

With this configuration everything is working fine… except the automatic scheduling using “DEFAULT_SCHED=2” (load-aware).

After reading your message: if I unshare the datastores (bring the NFS mount points down) and then instantiate a VM on the second node (with datastore 103), should the process run OK?

And regarding scheduling policies, could you tell me if there is a solution to my question?

Thanks @cgonzalez !!

If I’m understanding your problem properly, using the ssh driver will behave exactly as you want without defining any scheduling policy.

What I propose is to have only one image DS with TM_MAD=ssh and also just one system DS with the same TM_MAD=ssh. This way you don’t need to cross-mount the storage of the hypervisors, and when a VM is deployed on any hypervisor it will use its local storage [1].

The only difference with your current environment is that during the PROLOG phase the images will be copied from the frontend to the node using ssh* (as the storage is not shared anymore).

* As the datastores will be using the ssh driver, OpenNebula will always copy the images via ssh, even for the hypervisor node that lives on the same host as the frontend.

[1] Local Storage Datastore — OpenNebula 6.4.0 documentation
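
A minimal sketch of those two datastores (the names are placeholders; check [1] for the full attribute list):

# ssh_system.conf  ->  onedatastore create ssh_system.conf
NAME   = ssh_system
TYPE   = SYSTEM_DS
TM_MAD = ssh

# ssh_images.conf  ->  onedatastore create ssh_images.conf
NAME   = ssh_images
TYPE   = IMAGE_DS
DS_MAD = fs
TM_MAD = ssh

Alternatively, you could update the TM_MAD of your existing datastores 0 and 1 with onedatastore update instead of creating new ones.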

Hi @cgonzalez,

So, if I reconfigure my datastores as you say, do I need to have a “cluster” configured (Infrastructure → Clusters)? When I added the second node (KVM), I created a “cluster”, as you can see in these pictures:

I created it because, when I tried to instantiate a VM on the “new” local datastore (on the new KVM node), OpenNebula could not access it. DS 0, 1 and 2 belong to my server node (which also acts as a KVM node) and DS 103 is the local datastore on the KVM node (I don’t have a large shared datastore such as Ceph, Lustre, Gluster and so on…).

Since I will use “ssh” instead of “qcow2” as TM_MAD, will the image transfer (copy) from server to node be faster or slower?

I will try what you explain in a small OpenNebula VirtualBox scenario before applying this new configuration to my production environment (in a university college…).

Thanks a lot!

Hi @cgonzalez,

I have tested your configuration and I have hit a problem. After configuring datastore 0 (system) and datastore 1 (default, image) with TM_MAD=“ssh”, the instance can’t be deployed (it stays in “pending”) and “onedatastore show 0” reports no capacity:

[oneadmin@localhost ~]$ onedatastore show 0
DATASTORE 0 INFORMATION                                                         
ID             : 0                   
NAME           : system              
USER           : oneadmin            
GROUP          : oneadmin            
CLUSTERS       : 0                   
TYPE           : SYSTEM              
DS_MAD         : -                   
TM_MAD         : ssh                 
BASE PATH      : /var/lib/one//datastores/0
DISK_TYPE      : FILE                
STATE          : READY               

DATASTORE CAPACITY                                                              
TOTAL:         : -                   
FREE:          : -                   
USED:          : -                   
LIMIT:         : -                   

PERMISSIONS                                                                     
OWNER          : um-                 
GROUP          : u--                 
OTHER          : ---                 

DATASTORE TEMPLATE                                                              
ALLOW_ORPHANS="YES"
DISK_TYPE="FILE"
DS_MIGRATE="YES"
RESTRICTED_DIRS="/"
SAFE_DIRS="/var/tmp"
SHARED="NO"
TM_MAD="ssh"
TYPE="SYSTEM_DS"

IMAGES

My datastore 0 is:

And my datastore 1 is:

Also, before doing this reconfiguration (TM_MAD=“ssh”), I unshared the datastores over NFS (as you told me). The result of an instantiation was that the process failed because the “scp” transfer started on the KVM node, not on the frontend server, so scp tried to copy to a destination /var/lib/one/datastores/1 that does not exist on the KVM node. With the crossed NFS sharing, the KVM node had that datastore available “as” local.

So, about the problem with TM_MAD=“ssh”, what can I do? Right now the instantiation process fails and there is no VM log (?).

Thanks @cgonzalez !!

Hi @Daniel_Ruiz_Molina,

Let me try to clarify the point of using the ssh drivers, to help you understand the scenario I described. The aim of ssh datastores is to use the local storage of each hypervisor node; this way no shared storage is assumed and every image transfer is made over the ssh protocol.

Because the local storage of each hypervisor is used, when you check the capacity of the Datastore no capacity is shown, as it depends on the hypervisor. Before deploying the VM, the scheduler checks whether the selected hypervisor has enough storage to deploy the VM there; if not, it chooses another one.

This way no cluster is needed to guarantee that the storage exists, because OpenNebula will always use the local storage of the hypervisor (i.e. the storage backing the Datastore location).
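
If you want to see the storage figures the scheduler is actually working with, checking the host itself should show them (the host ID is a placeholder):

onehost show 1    # should report the monitored local system datastore usage of that hypervisor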

In order to check why your VM is stuck in pending, you can increase the log verbosity of the scheduler and look for errors in /var/log/one/sched.log (see the sketch after the list below). The typical issues are:

  • Hosts are not properly monitored (ensure SSH access)
  • The VM doesn’t fit on the hypervisor (check the VM capacity). Is the storage under the DS folder using the disk/partition intended for this? If not, you can manage this with symbolic links.
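
A quick sketch of both suggestions (the values and the /data path are examples, not your actual setup):

# /etc/one/sched.conf -- raise the scheduler verbosity, then restart the opennebula-scheduler service
LOG = [
  SYSTEM      = "file",
  DEBUG_LEVEL = 3          # 3 = DEBUG
]

# on a hypervisor: point the datastores directory at the disk you actually want to use
ln -s /data/datastores /var/lib/one/datastores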

The result of an instantiation was that the process failed because the “scp” transfer started on the KVM node, not on the frontend server, so scp tried to copy to a destination /var/lib/one/datastores/1 that does not exist on the KVM node.

This doesn’t look good to me. Unless a bridge server is configured, OpenNebula assumes by default that the images are accessible from the Frontend node; it shouldn’t look for them on the hypervisor. Can you share the exact error?

Hi @cgonzalez,

Now I’m testing in my OpenNebula VirtualBox mini-cluster what you told me yesterday. I changed datastores 0 (system) and 1 (default, image) to TM_MAD=ssh, removed datastores 101 and 102 from the KVM node, and unshared the crossed NFS (between server and KVM node). I also applied a placement scheduling policy in my template with host rank FREE_CPU (so the scheduler picks the node with the most free CPUs). With this I have been able to instantiate “n” instances on my server until all of its CPUs were allocated, but when I instantiated two more instances that should have been created on the KVM node, they failed, and sched.log shows this information:

Thu May 19 12:10:50 2022 [Z0][SCHED][D]: Setting VM groups placement constraints. Total time: 0.00s
Thu May 19 12:10:50 2022 [Z0][SCHED][D]: Host 0 discarded for VM 87. Not enough CPU capacity: 50/0
Thu May 19 12:10:50 2022 [Z0][RANK][D]: Rank evaluation for expression : FREE_CPU
Thu May 19 12:10:50 2022 [Z0][RANK][D]: ID: 1 Rank: 400
Thu May 19 12:10:50 2022 [Z0][RANK][D]: Rank evaluation for expression : FREE_CPU
Thu May 19 12:10:50 2022 [Z0][RANK][D]: ID: 0 Rank: 0
Thu May 19 12:10:50 2022 [Z0][SCHED][D]: Match Making statistics:
	Number of VMs:             1
	Total time:                0s
	Total Cluster Match time:  0s
	Total Host Match time:     0.00s
	Total Host Ranking time:   0.00s
	Total DS Match time:       0.00s
	Total DS Ranking time:     0.00s
	Total Network Match time:  0.00s
	Total Network Ranking time:0s

Thu May 19 12:10:50 2022 [Z0][SCHED][D]: Scheduling Results:
Virtual Machine: 87

	PRI	ID - HOSTS
	------------------------
	1	1

	PRI	ID - DATASTORES
	------------------------
	0	0


Thu May 19 12:10:50 2022 [Z0][SCHED][D]: Dispatching VMs to hosts:
	VMID	Priority	Host	System DS
	--------------------------------------------------------------
	87	0		1	0

It seems the scheduler correctly chooses host #1, but it keeps choosing datastore #0, which belongs to the server node, not the KVM node… Also, the VM details show:
Thu May 19 12:10:50 2022: Error executing image transfer script: Error creating directory /var/lib/one//datastores/0/87 at nodo: mkdir: cannot create directory '/var/lib/one//datastores/0/87': Permission denied

I don’t understand this message… It seems the server is trying to scp from datastore 0 to datastore 0… Hmmm… why? You told me yesterday that OpenNebula always uses the local storage of the hypervisor, so since I’m not choosing a datastore, I assumed the datastore would be the local datastore of the KVM node where the instance is being created…

Also, note that I didn’t create a new cluster. Both nodes (server and KVM node) are in the “default” cluster and I have deleted the datastores created for the KVM node, so in the Storage → Datastores menu I only see datastores 0, 1 and 2.

Please, help! Thanks a lot!!!

Hi again @cgonzalez,

After checking the logs, I decided to create the subfolder structure /var/lib/one/datastores/{0,1,2}, owned by user oneadmin and group oneadmin, on the KVM node and, voilà, now the VMs instantiate perfectly :wink:
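
For the record, what I ran on the KVM node was essentially this (as root):

mkdir -p /var/lib/one/datastores/{0,1,2}
chown -R oneadmin:oneadmin /var/lib/one/datastores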

I have checked that datastore 0 on the server doesn’t contain the VMs created in the KVM node’s “datastore 0”. You can check it here:
/var/lib/one/datastores/0 on the server:

[root@localhost 0]# tree
.
├── 77
│   ├── deployment.0
│   ├── disk.0
│   ├── disk.1
│   ├── ds.xml
│   └── vm.xml
├── 78
│   ├── deployment.0
│   ├── disk.0
│   ├── disk.1
│   ├── ds.xml
│   └── vm.xml
├── 79
│   ├── deployment.0
│   ├── disk.0
│   ├── disk.1
│   ├── ds.xml
│   └── vm.xml
└── 80
    ├── deployment.0
    ├── disk.0
    ├── disk.1
    ├── ds.xml
    └── vm.xml

4 directories, 20 files

/var/lib/one/datastores/0 on the KVM node:

[root@nodo 0]# tree 
.
├── 88
│   ├── deployment.0
│   ├── disk.0
│   ├── disk.1
│   ├── ds.xml
│   └── vm.xml
├── 89
│   ├── deployment.0
│   ├── disk.0
│   ├── disk.1
│   ├── ds.xml
│   └── vm.xml
├── 90
│   ├── deployment.0
│   ├── disk.0
│   ├── disk.1
│   ├── ds.xml
│   └── vm.xml
├── 91
│   ├── deployment.0
│   ├── disk.0
│   ├── disk.1
│   ├── ds.xml
│   └── vm.xml
└── 92
    ├── deployment.0
    ├── disk.0
    ├── disk.1
    ├── ds.xml
    └── vm.xml

5 directories, 25 files

As you can see here, my mini-cluster is running 9 instances: 4 on localhost (server) and 5 on nodo (KVM node):

And in Infrastructure → Hosts I see this:

So it seems everything is working fine now, doesn’t it?

Please confirm whether I am doing everything correctly.

Thanks again!

Hi @Daniel_Ruiz_Molina,

It looks pretty good to me!

I have checked that datastore 0 on the server doesn’t contain the VMs created in the KVM node’s “datastore 0”

Yep, that’s the expected behavior. Note that SSH Datastores are not shared, so you’ll only see the VMs running on each node; in other words, each host will have “its own Datastore 0”.