New install, nodes stuck in error monitoring

I have a new cluster of 6 physical nodes running KVM on Debian 12, with the Frontend/Sunstone as a guest VM on one of the nodes. All nodes and the frontend can communicate over the network, and passwordless ssh authentication and sudo are working perfectly. This is OpenNebula 7.0, by the way.

When I add a KVM node in the Sunstone GUI, it just gets stuck with an error on monitoring. It is the same on the command line:

ERROR="Tue Aug 12 17:07:36 2025 : Error monitoring Host 10.10.3.65 (31): "

I have the pre-requisite packages installed on all the physical nodes as per instructions.

I can run onehost enable on any host; it goes into init and then back to error.

I have redeployed the frontend 3 times now from scratch, and I can’t see what I’m doing wrong. I’m using a MariaDB backend instead of SQLite.

I see some ruby errors in /var/log/one/monitor.log. Any ideas how it could have ended up like this or how to fix it? Maybe it’s a bug of some sort?

Thanks,

G.

Tue Aug 12 17:38:53 2025 [Z0][HMM][E]: Unable to monitor host id: 31
Tue Aug 12 17:38:53 2025 [Z0][MDP][W]: Error parsing start message for host 31: <MONITOR_MESSAGES><SYSTEM_HOST>Error executing monitor_ds.rb: /usr/lib/ruby/3.1.0/fileutils.rb:243:in `mkdir': Permission denied @ dir_s_mkdir - /var/lib/one//datastores (Errno::EACCES)
    from /usr/lib/ruby/3.1.0/fileutils.rb:243:in `fu_mkdir'
    from /usr/lib/ruby/3.1.0/fileutils.rb:221:in `block (2 levels) in mkdir_p'
    from /usr/lib/ruby/3.1.0/fileutils.rb:219:in `reverse_each'
    from /usr/lib/ruby/3.1.0/fileutils.rb:219:in `block in mkdir_p'
    from /usr/lib/ruby/3.1.0/fileutils.rb:211:in `each'
    from /usr/lib/ruby/3.1.0/fileutils.rb:211:in `mkdir_p'
    from /var/tmp/one/im/kvm.d/../kvm-probes.d/host/system/monitor_ds.rb:53:in `initialize'
    from /var/tmp/one/im/kvm.d/../kvm-probes.d/host/system/monitor_ds.rb:158:in `new'
    from /var/tmp/one/im/kvm.d/../kvm-probes.d/host/system/monitor_ds.rb:158:in `<main>'
</SYSTEM_HOST>
Tue Aug 12 17:38:53 2025 [Z0][HMM][E]: Unable to monitor host id: 32
Tue Aug 12 17:38:53 2025 [Z0][MDP][W]: Error parsing start message for host 32: <MONITOR_MESSAGES><SYSTEM_HOST>Error executing monitor_ds.rb: /usr/lib/ruby/3.1.0/fileutils.rb:243:in `mkdir': Permission denied @ dir_s_mkdir - /var/lib/one//datastores (Errno::EACCES)
    from /usr/lib/ruby/3.1.0/fileutils.rb:243:in `fu_mkdir'
    from /usr/lib/ruby/3.1.0/fileutils.rb:221:in `block (2 levels) in mkdir_p'
    from /usr/lib/ruby/3.1.0/fileutils.rb:219:in `reverse_each'
    from /usr/lib/ruby/3.1.0/fileutils.rb:219:in `block in mkdir_p'
    from /usr/lib/ruby/3.1.0/fileutils.rb:211:in `each'
    from /usr/lib/ruby/3.1.0/fileutils.rb:211:in `mkdir_p'
    from /var/tmp/one/im/kvm.d/../kvm-probes.d/host/system/monitor_ds.rb:53:in `initialize'
    from /var/tmp/one/im/kvm.d/../kvm-probes.d/host/system/monitor_ds.rb:158:in `new'
    from /var/tmp/one/im/kvm.d/../kvm-probes.d/host/system/monitor_ds.rb:158:in `<main>'
</SYSTEM_HOST>
Tue Aug 12 17:38:54 2025 [Z0][MDP][I]: Command execution failed (exit code: 1): 'if [ -x "/var/tmp/one/im/run_monitord_client" ]; then /var/tmp/one/im/run_monitord_client kvm 29 10.10.3.63; else exit 42; fi'
Tue Aug 12 17:38:54 2025 [Z0][MDP][I]: Error executing monitord-client_control.sh: /usr/bin/env: ‘ruby’: No such file or directory
Tue Aug 12 17:38:54 2025 [Z0][MDP][W]: Start monitor failed for host 29: Error executing monitord-client_control.sh: /usr/bin/env: ‘ruby’: No such file or directory
Tue Aug 12 17:38:54 2025 [Z0][HMM][E]: Unable to monitor host id: 29
Tue Aug 12 17:38:54 2025 [Z0][MDP][I]: Command execution failed (exit code: 1): 'if [ -x "/var/tmp/one/im/run_monitord_client" ]; then /var/tmp/one/im/run_monitord_client kvm 28 10.10.3.62; else exit 42; fi'
Tue Aug 12 17:38:54 2025 [Z0][MDP][I]: Error executing monitord-client_control.sh: /usr/bin/env: ‘ruby’: No such file or directory
Tue Aug 12 17:38:54 2025 [Z0][MDP][W]: Start monitor failed for host 28: Error executing monitord-client_control.sh: /usr/bin/env: ‘ruby’: No such file or directory
Tue Aug 12 17:38:54 2025 [Z0][HMM][E]: Unable to monitor host id: 28

I’d run chown -R oneadmin:oneadmin /var/lib/one and verify that the OpenNebula-related processes are owned by the oneadmin user. Also, check the AppArmor logs as another possible source of errors.

The "/usr/bin/env: ‘ruby’: No such file or directory" error indicates that ruby is missing or not reachable by the oneadmin user on those hosts.
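A quick way to test the ruby lookup is sketched below. It is only an illustration: check_cmd is a made-up helper name, and it performs the same kind of PATH search that /usr/bin/env does, so it should be run as oneadmin on the affected host.

```shell
# check_cmd is a hypothetical helper, not part of OpenNebula.
# It reports whether a command resolves on the current PATH, which is the
# lookup that "/usr/bin/env ruby" performs when a probe starts.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1 found at $(command -v "$1")"
    else
        echo "missing: $1 is not on PATH ($PATH)"
        return 1
    fi
}

# Run as oneadmin on the host that logged the error.
check_cmd ruby || true
```

The PATH of a non-interactive ssh session can differ from that of a login shell, so it is worth running the check over the same kind of ssh connection the frontend uses.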

Hope this helps,

Anton Todorov

Thank you that really did help!!! I was starting to feel defeated by this. I was assuming some code issue with how ruby was being interpreted - but that was a ruse.

So the concept of remotes was confusing me a bit; I didn’t understand what ran from where. Detailing this for anybody who encounters similar issues: remotes define how a host (KVM in my case) interacts with the frontend. They live on the frontend initially but are synced to the host over ssh. A copy of those scripts then lands under /var/tmp/one on the host and runs there, gathering stats to send back to the frontend's monitoring.

What confused me a little was that it seemed to be a bash script with a bash shebang but had a lot of ruby in it. I started changing the shebang on the assumption that ruby should be running it directly. There was no need for me to go near that; it just caused further errors. I also thought it might be AppArmor, as you suggested, but ruled that out. I even thought the bash script might be running under the more restricted dash shell on the host; again, that was incorrect. But you were correct that it was a permissions problem. I’d glossed over that.
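For anyone tempted to edit a shebang as I did, reading it is safer. The sketch below is only an illustration: shebang_of is a made-up helper name, and the path is the one from the monitor.log excerpt, which only exists on a host after the remotes have been synced.

```shell
# shebang_of is a hypothetical helper, not part of OpenNebula.
# It prints the interpreter line of a script without modifying the file.
shebang_of() {
    first=$(head -n 1 "$1") || return 1
    case "$first" in
        '#!'*) echo "$first" ;;
        *)     echo "no shebang in $1"; return 1 ;;
    esac
}

# Path taken from the log excerpt; skipped quietly if the file is absent.
if [ -r /var/tmp/one/im/run_monitord_client ]; then
    shebang_of /var/tmp/one/im/run_monitord_client
fi
```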

Ultimately, the problem was that oneadmin had insufficient permissions to create directories under /var/lib/one on the host. This solved it:

chown -R oneadmin:oneadmin /var/lib/one/

Then onehost sync and onehost enable, and it finally worked.
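To confirm the permission problem before or after the chown, something like the following can be run as oneadmin on a host. It is a rough sketch: can_mkdir_under is a made-up helper name, and it only approximates what the failing mkdir_p call in monitor_ds.rb needed to do. The stat -c option assumes GNU coreutils, as on Debian.

```shell
# can_mkdir_under is a hypothetical helper, not part of OpenNebula.
# It tries to create (and then remove) a scratch directory under the given
# parent, so nothing is left behind on success.
can_mkdir_under() {
    parent=$1
    probe="$parent/.perm_check_$$"
    if mkdir "$probe" 2>/dev/null; then
        rmdir "$probe"
        echo "writable: $parent"
    else
        owner=$(stat -c '%U:%G' "$parent" 2>/dev/null || echo unknown)
        echo "not writable: $parent (owner: $owner)"
        return 1
    fi
}

# Run as oneadmin on the host; /var/lib/one is the path from the error.
can_mkdir_under /var/lib/one || true
```

If this reports "not writable" with an owner other than oneadmin, that matches the Errno::EACCES in the log.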

How or why the permissions were insufficient on a vanilla install, I don’t know.
But in a way this has been useful: I understand the architecture a bit better.

Thanks again!

G.
