What's wrong? Maybe a bug?

Hello, I'm using the latest version of OpenNebula for my VMware cluster (22 hosts, 12 datastores); only 30 VMs are active and monitored in OpenNebula. For some weeks now my OpenNebula server has been running under high load. I checked /var/log/one/oned.log and found no errors, but the VM status became UNKNOWN.
I ran some tests and found that the load is low right after restarting OpenNebula… after 10 to 15 minutes the first CPU is at 100%, and then the same happens to all the server's CPUs, so for me the only solution is to restart OpenNebula.
I suppose Ruby is taking all the server's resources; in fact, after 1 hour I can see several processes like this:

/bin/bash /var/lib/one/remotes/im/run_probes vcenter /var/lib/one//datastores 4124 20 4 Cluster

I attached the htop output…

What’s wrong with my OpenNebula installation?
Thanks

It seems that monitoring takes more than 60 seconds (the default monitoring interval) to finish. This causes new monitoring processes to start before the previous one has finished.
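
To check whether probes really are piling up, one option is to count the run_probes processes that are alive at a given moment. This is only a sketch: `count_probes` is a made-up helper name, and it assumes the probes show up in `ps` with the `remotes/im/run_probes` path seen in the process listing above.

```shell
# Hypothetical helper: count run_probes entries in a process listing on stdin.
# The [s] keeps grep's own command line from matching the pattern.
count_probes() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c 'remotes/im/run_probe[s]' || true
}

# Example use on the OpenNebula front-end:
#   ps -eo args | count_probes
```

A count that keeps growing between checks would suggest that new probes start before the old ones finish.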

  • Stop OpenNebula
  • Clean the host of run probes / ruby processes
  • Time the execution of the monitoring agent (as oneadmin):
$ time /bin/bash /var/lib/one/remotes/im/run_probes vcenter /var/lib/one//datastores 4124 20 4 Cluster
  • Change the monitoring interval to a value higher than the time you got in the last step
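
The timing and interval steps above could be scripted roughly like this. It is a sketch, not an official tool: `time_probe` and `suggest_interval` are hypothetical helper names, and the 50% safety margin is an arbitrary assumption.

```shell
# Hypothetical helper: run a command and print how many whole seconds it took.
# The command's stdout is discarded; its stderr stays visible so failures show up.
time_probe() {
    local start end status=0
    start=$(date +%s)
    "$@" >/dev/null || status=$?
    end=$(date +%s)
    if [ "$status" -ne 0 ]; then
        echo "warning: command exited with status $status; timing is not meaningful" >&2
    fi
    echo $(( end - start ))
}

# Hypothetical helper: add a safety margin (default 50%, an assumption) to a time in seconds.
suggest_interval() {
    local elapsed=$1 margin_pct=${2:-50}
    echo $(( elapsed + elapsed * margin_pct / 100 ))
}

# Example use (as oneadmin), with the probe command from this thread:
#   elapsed=$(time_probe /bin/bash /var/lib/one/remotes/im/run_probes vcenter /var/lib/one//datastores 4124 20 4 Cluster)
#   echo "probe took ${elapsed}s; consider MONITORING_INTERVAL = $(suggest_interval "$elapsed")"
```

If the command fails almost immediately, the measured time says nothing about a real monitoring run, which is why the sketch prints a warning in that case.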

Hello Javi… I set MONITORING_INTERVAL = 240
but I get the same result… all CPUs at 100%

How long did the manual monitoring run take?

How can I check it?
thanks

Javi already told you. :)

Just tried… I'm getting the following error:

[oneadmin@cloud root]$ time /bin/bash /var/lib/one/remotes/im/run_probes vcenter /var/lib/one//datastores 4124 20 4 Cluster
/usr/lib/one/ruby/vcenter_driver.rb:854:in `rescue in initialize_one': Error initializing OpenNebula client: Error getting oned configuration : Connection refused - connect(2) (RuntimeError)
        from /usr/lib/one/ruby/vcenter_driver.rb:837:in `initialize_one'
        from /usr/lib/one/ruby/vcenter_driver.rb:161:in `initialize'
        from ./vcenter.rb:37:in `new'
        from ./vcenter.rb:37:in `<main>'
ERROR MESSAGE --8<------
Error executing vcenter.rb
ERROR MESSAGE ------>8--

real 0m0.485s
user 0m0.434s
sys 0m0.053s

Checked… it takes 20 minutes… how is that possible?
Thanks
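
For a rough sense of scale: if a new probe starts every interval whether or not the previous one has finished (as described earlier in the thread), a probe that runs for 20 minutes will overlap with many later ones. A back-of-the-envelope sketch, with `overlapping_probes` as a hypothetical helper and ignoring any scheduling details inside oned:

```shell
# Hypothetical helper: rough number of probes alive at once for a single probed target,
# computed as ceil(duration / interval) with integer arithmetic (both in seconds).
overlapping_probes() {
    local duration=$1 interval=$2
    echo $(( (duration + interval - 1) / interval ))
}

overlapping_probes 1200 60    # 20-minute probe, 60 s interval  -> prints 20
overlapping_probes 1200 240   # 20-minute probe, 240 s interval -> prints 5
```

Under these assumptions, an interval of 240 seconds still leaves about 5 probes running at once, against about 20 at 60 seconds, so raising the interval to 240 would not be expected to fix the load on its own.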