Scheduler dispatch slow

logger · January 8, 2016, 3:18pm

I have an odd issue that I am not sure how to troubleshoot.

If I launch a new VM from a template, it hangs in the pending state for up to 35 minutes or until I restart one.
After that the next interval triggers, the VMs are dispatched, and everything is fine. If I instantiate more VMs shortly afterwards, they too launch fine, more or less within the 30 seconds configured in sched.conf. But at some point, after about 5-10 minutes, new VMs are no longer dispatched within the expected time.

I upped the log level and can see the time taken for the last two scheduling actions: about 25 and 35 minutes.

Fri Jan  8 15:08:27 2016 [Z0][SCHED][D]: Dispatching VMs to hosts. Total time: 0.02s
Fri Jan  8 15:34:03 2016 [Z0][SCHED][D]: Getting scheduled actions information. Total time: 1506.22s
Fri Jan  8 15:34:03 2016 [Z0][SCHED][D]: Getting VM and Host information. Total time: 0.01s
Fri Jan  8 16:10:02 2016 [Z0][SCHED][D]: Getting scheduled actions information. Total time: 2129.15s
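
To spot slow phases like these without reading the log by eye, the "Total time" values can be pulled out with a short script. This is only a sketch: the regular expression is inferred from the four lines quoted above, and the 30-second threshold is an arbitrary choice that happens to match the scheduling interval mentioned earlier, so both may need adjusting for other log formats or settings.

```python
import re

# Pattern assumed from the quoted lines:
# "<timestamp> [Z0][SCHED][D]: <phase>. Total time: <seconds>s"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} \d{4}) "
    r"\[Z\d+\]\[SCHED\]\[\w\]: (?P<phase>.+?)\. Total time: (?P<secs>[\d.]+)s$"
)


def slow_phases(lines, threshold=30.0):
    """Return (timestamp, phase, seconds) for every phase slower than threshold."""
    slow = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a "Total time" line
        secs = float(match.group("secs"))
        if secs > threshold:
            slow.append((match.group("ts"), match.group("phase"), secs))
    return slow


# The four lines from the post, used here as sample input.
sample = [
    "Fri Jan  8 15:08:27 2016 [Z0][SCHED][D]: Dispatching VMs to hosts. Total time: 0.02s",
    "Fri Jan  8 15:34:03 2016 [Z0][SCHED][D]: Getting scheduled actions information. Total time: 1506.22s",
    "Fri Jan  8 15:34:03 2016 [Z0][SCHED][D]: Getting VM and Host information. Total time: 0.01s",
    "Fri Jan  8 16:10:02 2016 [Z0][SCHED][D]: Getting scheduled actions information. Total time: 2129.15s",
]

for ts, phase, secs in slow_phases(sample):
    print(f"{ts}  {phase}: {secs / 60:.1f} min")
```

On the sample, only the two "Getting scheduled actions information" entries exceed the threshold, which matches what the post describes.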

As far as I can see, all other activity and actions carry on as normal. Existing VMs can migrate around, etc. There is only one cluster of 3 CentOS KVM hosts. There does not appear to be anything in the various logs to indicate why the VMs are waiting to be dispatched.

Thanks in advance for any insight you can offer.

ruben · January 8, 2016, 3:32pm

Hi

How many VMs do you have (i.e. the number of lines in onevm list)? How long
does it take to execute the onevm list command?
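
Both numbers can be collected in one go. The sketch below simply wraps the command with a wall-clock timer; the helper name `time_command` is made up for illustration, and the line count includes whatever header line the command prints. It should be run on the front-end as the user that runs the scheduler, so it sees the same environment.

```python
import shutil
import subprocess
import time


def time_command(argv):
    """Run argv; return (exit_code, wall_clock_seconds, output_line_count)."""
    start = time.monotonic()
    proc = subprocess.run(
        argv, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL, text=True
    )
    elapsed = time.monotonic() - start
    return proc.returncode, elapsed, len(proc.stdout.splitlines())


# Only attempt the call where the OpenNebula CLI is actually installed.
if shutil.which("onevm"):
    code, secs, lines = time_command(["onevm", "list"])
    print(f"exit={code} lines={lines} (header included) time={secs:.2f}s")
else:
    print("onevm not found on PATH; run this on the front-end")
```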

Cheers

logger · January 8, 2016, 3:48pm

10 VMs. List commands have no delay.

The resources allocated are not beyond those available. Scheduling is set to spread VMs between hosts. Storage is a Ceph cluster. The same scheduling issue happens with 40 MB ttylinux test VMs and 50 GB Ubuntu VMs.
The three hosts are in the same cluster, and the templates require only that VMs are deployed to that cluster.

I can’t find evidence that the system is unable to decide which of the 3 hosts to send each VM to, other than that long information-gathering period. The dispatch itself takes moments.

Thanks for the quick response.

ruben · January 11, 2016, 4:45pm

Hi,

I was worried about those ~1000s for the scheduled actions; those are
actually obtained through the same API call as a onevm list command. Size
of the pools (VM, Host, …) is the main factor that determines the scheduling
execution time. Could it be a networking problem on the machine running
the scheduler, for example an http_proxy variable, name resolution
issues…?
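
A quick way to rule those two out is sketched below. It lists the proxy variables visible to the current process and times a name lookup. The list of variable names and the `localhost` placeholder are assumptions; substitute the hostname the scheduler uses to reach oned, and run it as the same user as the scheduler, since the environment can differ between users.

```python
import os
import socket
import time

# Variable names commonly honoured by HTTP clients (assumed list).
PROXY_VARS = ("http_proxy", "https_proxy", "no_proxy")


def proxy_settings(environ=os.environ):
    """Return the proxy-related variables set in environ, in either case."""
    found = {}
    for name in PROXY_VARS:
        for key in (name, name.upper()):
            if environ.get(key):
                found[key] = environ[key]
    return found


def time_resolution(hostname):
    """Return the seconds taken to resolve hostname, or None if it fails."""
    start = time.monotonic()
    try:
        socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return None
    return time.monotonic() - start


print("proxy variables:", proxy_settings() or "none set")
for host in ("localhost",):  # placeholder: use the names oned and the hosts use
    secs = time_resolution(host)
    print(host, "unresolved" if secs is None else f"resolved in {secs:.3f}s")
```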

Cheers

logger · January 11, 2016, 4:53pm

I have had issues with the HTTP proxy before, but they are now resolved. The same or a similar log entry appears when I restart one, but that is because the process is briefly down and it assumes it’s unreachable. There is no delay with name resolution either. After restarting one, it always catches the pending deployments and they are out within 30 seconds.

I was expecting to see something in the log suggesting it can’t come to a decision about which host to dispatch to, since they are all ‘equal’. I suspect it’s something funky with my environment, but I don’t see it.

ranjith · September 29, 2017, 6:52am

Hi, I have a delay of 10 seconds when executing the onevm list command in OpenNebula Sunstone. There’s nothing in the logs. The whole OpenNebula server is slow. What is the reason behind it?
