No host meets capacity and SCHED_REQUIREMENTS

Trying to use vagrant plugin for ONE 5.0. The same plugin works when using KVM host and predefined template over there.

Now I am trying to do the same with vCenter, but the VM is stuck in PENDING state. Nothing is logged, and I cannot find anything that points to the problem except for this error message in the Scheduler section of the VM:

Tue Oct 11 16:22:43 2016 : No host meets capacity and SCHED_REQUIREMENTS: (CLUSTER_ID = 0) & !(PUBLIC_CLOUD = YES) & ( ID="2" )

Interesting part is that instantiation works fine when I try this manually. No complaints. Also, error message mentions some “PUBLIC_CLOUD” thing which I think is not defined anywhere. Where does it come from?
Anyhow, my understanding of that requirement is “cluster id has to be 0, public_cloud must be NO (!) and host ID has to be 2”… All fine with me.

Scheduler log is of no help, I can see repeated attempts (every 30 sec) of deployment:
16:19:43 2016 [Z0][VM][D]: Found 1 pending/rescheduling VMs.
16:19:43 2016 [Z0][HOST][D]: Discovered 2 enabled hosts.
16:19:43 2016 [Z0][SCHED][D]: Match-making results for VM 26:
16:19:43 2016 [Z0][SCHED][D]: Dispatching VMs to hosts:

Nothing more. VM 26 is still in PENDING state.

Can somebody point me to where to look next? I am trying to pinpoint the root cause somehow…

Sorry, I did not look at the fully parsed sched.log, so the most important lines were omitted. This is what the VM 26 related info looks like:
Oct 11 16:27:13 2016 [Z0][VM][D]: Found 1 pending/rescheduling VMs.
Oct 11 16:27:13 2016 [Z0][HOST][D]: Discovered 2 enabled hosts.
Oct 11 16:27:13 2016 [Z0][SCHED][D]: Match-making results for VM 26:
Cannot schedule VM, there is no suitable host.

Oct 11 16:27:13 2016 [Z0][SCHED][D]: Dispatching VMs to hosts:
VMID Host System DS
-------------------------

Now we see the real problem. ONE thinks it has no suitable host to deploy to. Yet ID=2 is my VMware vCenter host. I tried the same with the host name instead of the ID, and the result is the same. And the whole thing works when tried manually from Sunstone.

Any idea is greatly appreciated.

You can get more information from:

1.- The VM detailed information: in the template there should be a SCHED_MESSAGE with more information.
2.- Stop the scheduler and start it again with the log level set to debug (in sched.conf); you'll get more detailed information.

Thanks for the tip on the debug level… Will try it. As for the sched message, I already sent the message from the VM instance:
Tue Oct 11 16:22:43 2016 : No host meets capacity and SCHED_REQUIREMENTS: (CLUSTER_ID = 0) & !(PUBLIC_CLOUD = YES) & ( ID="2" )

You said “in the template” - did you mean in the VM instance?

Yes in the VM instance.

However, it seems that either your host (2) is not in cluster (0) (could you double check?) or there is not enough capacity (free CPU, MEMORY) for the VM. You can check both from the host information.

Yesterday, the last time I tried, the VM instance was created with no problems at all when doing manual instantiation. This should mean that the host is fine (and when I look at the numbers, the VMware cluster is way underutilized).

I will ramp up the debug level and then try both ways, manual and vagrant, to see the difference.

OK, debug level set to 5 and both tests done, manual and vagrant. Indeed, ONE thinks it cannot deploy a VM, that much is clear:

“Cannot schedule VM, there is no suitable host.”

But this is very strange, as the manual instance is deployed with no issue. Also, sched.log is not giving me exact info about the root cause. Comparing the log data from both deployments, there is no difference, except that the vagrant instance complains that both hosts fail to meet the requirements.

“Host 2 discarded for VM 28. It does not fulfill SCHED_REQUIREMENTS.”

VM 27 is the manual version and it does not complain about host 2:

Wed Oct 12 14:32:42 2016 [Z0][SCHED][D]: Dispatching VMs to hosts:
VMID Host System DS
-------------------------
27 2 -1

Here are both sched.log parts regarding VM27 and VM 28:
vm28_vagrant_failed.txt (2.1 KB)
vm27_manual_success.txt (2.1 KB)
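When sched.log gets long, the discard lines can be pulled out with a few lines of Python. This is only a sketch: the regular expression is based solely on the message format quoted in this thread ("Host 2 discarded for VM 28. ..."), and the function name is made up for illustration. Other scheduler versions may word the message differently.

```python
import re

# Pattern derived from the log line quoted in this thread; treat it as an
# assumption about the message format, not a documented guarantee.
DISCARD_RE = re.compile(r"Host (\d+) discarded for VM (\d+)\. (.*)")

def discarded_hosts(lines):
    """Return {vm_id: [(host_id, reason), ...]} for every discard line."""
    result = {}
    for line in lines:
        match = DISCARD_RE.search(line)
        if match:
            host_id, vm_id, reason = match.groups()
            result.setdefault(int(vm_id), []).append((int(host_id), reason.strip()))
    return result

# Sample lines copied from the thread; in practice you would pass an open
# file object for your own sched.log instead.
sample = [
    "Host 2 discarded for VM 28. It does not fulfill SCHED_REQUIREMENTS.",
    "Cannot schedule VM, there is no suitable host.",
]
print(discarded_hosts(sample))
```

Running it against the two attached log excerpts would show which hosts were discarded for VM 28 and confirm that no such line exists for VM 27.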

Here are outputs that show that sched requirements should be met. The message was

No host meets capacity and SCHED_REQUIREMENTS: (CLUSTER_ID = 0) & !(PUBLIC_CLOUD = YES) & ( ID="2" )

And from the outputs you can see all requirements are indeed met:
[root@nebula ~]# onehost list
ID NAME          CLUSTER RVM ALLOCATED_CPU    ALLOCATED_MEM      STAT
 1 csesx5        default   3 30 / 3200 (0%)   384M / 755.6G (0%) on
 2 Cloud-Cluster default   4 800 / 3200 (25%) 16G / 1.5T (1%)    on

[root@nebula ~]# onehost show 2 |grep PUBLIC
PUBLIC_CLOUD="YES"

[root@nebula ~]# onecluster list
ID NAME    HOSTS VNETS DATASTORES
 0 default     2     3          5

Back to square one: why is ONE not allowing this instance? THANKS

No host meets capacity and SCHED_REQUIREMENTS: (CLUSTER_ID = 0) & !(PUBLIC_CLOUD = YES) & ( ID="2" )

[root@nebula ~]# onehost show 2 |grep PUBLIC
PUBLIC_CLOUD="YES"

The requirement says host 2 must NOT have PUBLIC_CLOUD = YES; note the ! at the beginning of the clause.
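To make the logic concrete, here is a small hand-translated sketch of the requirement expression in Python. It is not OpenNebula's own expression parser, only the same boolean logic applied to the host attributes reported in this thread; the function name is made up for illustration.

```python
# Host 2 attributes as reported in the thread (onehost list, onecluster list,
# onehost show 2). Values are kept as strings, as they appear in the output.
host2 = {"ID": "2", "CLUSTER_ID": "0", "PUBLIC_CLOUD": "YES"}

def meets_requirements(host):
    """Hand translation of:
    (CLUSTER_ID = 0) & !(PUBLIC_CLOUD = YES) & (ID = "2")
    """
    in_cluster_0 = host.get("CLUSTER_ID") == "0"
    not_public_cloud = not (host.get("PUBLIC_CLOUD") == "YES")
    is_host_2 = host.get("ID") == "2"
    return in_cluster_0 and not_public_cloud and is_host_2

# The host as it is: the negated clause fails, so the whole expression fails.
print(meets_requirements(host2))

# The same host without the PUBLIC_CLOUD attribute would pass.
without_flag = {k: v for k, v in host2.items() if k != "PUBLIC_CLOUD"}
print(meets_requirements(without_flag))
```

The first call prints False and the second prints True: the cluster and ID clauses are satisfied, and only the negated PUBLIC_CLOUD clause rules host 2 out.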

Ruben S. Montero, PhD

Damn it, it was easy… After looking through so many logs I was totally blind to it!

Thanks a lot… Now can you just tell me what this parameter is for? After I imported vCenter it was just there…

thanks

Hypervisors are divided into two classes: fully managed (like KVM) or partially managed (like AWS or vCenter). In the latter case OpenNebula talks to another cloud manager instead of talking to the hypervisor. When a host is labeled as PUBLIC_CLOUD some operations are delegated, for example storage operations (e.g. vCenter is responsible for cloning disk images).

Cheers

Great, it all makes sense. I am very grateful for your help; I am always trying to really understand how the system works.

Now, after all this, one big question pops up. Who is actually setting the scheduling requirements? I thought that all of this is set up in the VM template. You just issue the command to make an instance and that's it. The template is the “supreme authority”. But this is obviously not true, because my manual instance is working, and the same template called through vagrant is not.
And of course I want my vCenter host to keep PUBLIC_CLOUD set to “yes”… I do not want to change the logical setup just because of vagrant.

Could it be that vagrant is sort of overriding my template through API calls? Maybe there is some default behavior which is not shown in the vagrantfile.
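One way to check this is to compare what the two VMs actually ended up with. The sketch below diffs the USER_TEMPLATE sections of two VM XML documents, such as those printed by `onevm show <id> --xml`. The helper names and the sample XML are hypothetical and heavily trimmed; the assumption is that the user-supplied attributes sit under a USER_TEMPLATE element of the VM document.

```python
import xml.etree.ElementTree as ET

def user_template(vm_xml):
    """Flatten the USER_TEMPLATE children of a VM XML document into a dict."""
    section = ET.fromstring(vm_xml).find("USER_TEMPLATE")
    if section is None:
        return {}
    # itertext() also collects text from nested sub-elements, if any.
    return {child.tag: "".join(child.itertext()).strip() for child in section}

def diff_templates(xml_a, xml_b):
    """Return {attribute: (value_in_a, value_in_b)} for attributes that differ."""
    a, b = user_template(xml_a), user_template(xml_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }

# Hypothetical, trimmed documents with placeholder attributes; real output
# contains many more fields and different attribute names.
vm_a = "<VM><ID>27</ID><USER_TEMPLATE><SHARED>x</SHARED><ONLY_A>1</ONLY_A></USER_TEMPLATE></VM>"
vm_b = "<VM><ID>28</ID><USER_TEMPLATE><SHARED>x</SHARED></USER_TEMPLATE></VM>"
print(diff_templates(vm_a, vm_b))
```

Any attribute present in the manually created VM but missing from the vagrant one (or the other way round) shows up in the result, which narrows down whether the plugin is sending a different template.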