Monitoring script failing to execute after OS upgrade

“Wed Nov 18 16:39:19 2015 : Error monitoring Host fcl010 (165): Error executing probes”

I can still launch virtual machines on the host and they start. I can also resume and boot
VMs in the poff and unkn states, and that works. But the status of the host
stays in either update or retr. If I disable and re-enable the host, it then stays
stuck in state init.

I’m guessing something changed in the sudo permissions, but I’m not sure what.
What is the first command that the monitoring scripts would try to execute, and
where would it use sudo?

This is OpenNebula 4.8, by the way… I can ssh in to all the nodes without a password as oneadmin from the head node, with the same credentials that oned is using. Other nodes that weren’t touched
are still working fine.

Steve Timm

Not sure exactly what caused the jam-up… From earlier tickets in this same forum, one of which I had answered myself, I saw the suggestion to run the run_probes command manually with the right
arguments. That worked, but the state of the node still didn’t move off of “init”.
But a “onehost sync --force” did work and eventually got all the nodes back to status “on”.
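
For anyone hitting the same thing, the fix above can be sketched as a small shell helper. This is only a sketch: the function name resync_hosts is made up for illustration, and it assumes it is run as oneadmin on the front-end where the onehost CLI is installed. The only commands it calls are the "onehost sync --force" mentioned above and "onehost list" to check the host states afterwards.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative, not part of OpenNebula).
# Assumption: run as oneadmin on the OpenNebula front-end.
resync_hosts() {
    # Bail out early if the onehost CLI is not available here.
    if ! command -v onehost >/dev/null 2>&1; then
        echo "onehost not found; run this on the OpenNebula front-end" >&2
        return 1
    fi

    # Force a re-copy of the remote scripts (probes) to the hosts.
    onehost sync --force || return 1

    # Show the host states; they may take a few monitoring cycles
    # to move from init/update back to on.
    onehost list
}
```

Calling resync_hosts and then re-running "onehost list" after a few minutes should show whether the hosts have left the init state.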

Glad you fixed it.

We mention it in the upgrade guide:

Would you put it elsewhere? Thoughts?