Mon Jul 11 20:49:37 2016 [Z0][InM][I]: Command execution fail: 'if [ -x "/var/tmp/one/im/run_probes" ]; then /var/tmp/one/im/run_probes kvm /var/lib/one//datastores 4124 20 4 node-01.host.com; else exit 42; fi'
Mon Jul 11 20:49:37 2016 [Z0][InM][I]: Warning: Permanently added 'node-01.host.com,192.168.22.222' (ECDSA) to the list of known hosts.
Mon Jul 11 20:49:37 2016 [Z0][InM][I]: /var/tmp/one/im/run_probes: line 34: 23932 Aborted ./$i $ARGUMENTS
Mon Jul 11 20:49:37 2016 [Z0][InM][E]: Error executing collectd-client.rb
Mon Jul 11 20:49:37 2016 [Z0][InM][I]: ExitCode: 134
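A note on the exit code: shells conventionally report a process killed by signal N as status 128 + N, so 134 points at signal 6 (SIGABRT), which is consistent with the "Aborted" line in the log. A minimal sketch for decoding such a status follows; signal_from_exit is a made-up helper name, not something OpenNebula ships.

```shell
# Decode an exit status above 128 into the name of the terminating signal.
# signal_from_exit is a hypothetical helper for illustration only.
signal_from_exit() {
    code=$1
    if [ "$code" -gt 128 ]; then
        # kill -l <number> prints the signal name without the SIG prefix
        kill -l "$((code - 128))"
    else
        echo "exit status $code does not indicate death by signal"
    fi
}

signal_from_exit 134
```

This only tells you that the probe aborted, not why; running the probe by hand on the host is still needed to see the actual cause.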
This is probably caused by a missing dependency on the host. Could you
double-check that everything is installed? You can also run the
command directly on the host to debug it:
/var/tmp/one/im/run_probes kvm /var/lib/one//datastores 4124 20 4 node-01.host.com
Hi Ruben,
Thanks for the reply. It had actually been running fine for six months; the error appeared only recently.
When I run the command as you suggested, it shows the following error:
[root@f1-cloud-01 ~]# /var/tmp/one/im/run_probes kvm /var/lib/one//datastores 4124 20 4
/var/tmp/one/im/run_probes: line 34: 16015 Aborted ./$i $ARGUMENTS
ERROR MESSAGE --8<------
Error executing collectd-client.rb
ERROR MESSAGE ------>8--
[root@f1-cloud-01 ~]#
Not sure this is your case, but we had a problem with OpenNebula and the default location (/var/tmp/one). On CentOS 7, files are deleted from there after 30 days. OpenNebula tries to copy the missing files again, but I think not all of them. After moving the files to another location by editing the SCRIPTS_REMOTE_DIR variable in oned.conf, everything was fine (OpenNebula 4.12).
@denis-ldv: using a directory in /var/tmp is also a local security hole if the host OS is not used solely for OpenNebula. I have also set up SCRIPTS_REMOTE_DIR myself (I used /var/lib/one/tmp), but I don't understand why something like this is not the default.
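To see whether a host really ages out files under /var/tmp, you can look at the systemd-tmpfiles rules. The sketch below assumes the stock rule lives in /usr/lib/tmpfiles.d/tmp.conf (an assumption about the CentOS 7 layout, check your own host) and that the file follows the tmpfiles.d(5) column order of type, path, mode, user, group, age. tmpfiles_age is a made-up helper name.

```shell
# Print the cleanup age configured for a path in a tmpfiles.d-style file.
# Column order per tmpfiles.d(5): type path mode user group age [argument].
# tmpfiles_age is a hypothetical helper for illustration only.
tmpfiles_age() {
    path=$1
    conf=$2
    # Skip comment lines, match the path column exactly, print the age column
    awk -v p="$path" '$1 !~ /^#/ && $2 == p { print $6 }' "$conf"
}

# Assumed location of the distribution's default rule; adjust if it differs.
stock_conf=/usr/lib/tmpfiles.d/tmp.conf
if [ -r "$stock_conf" ]; then
    tmpfiles_age /var/tmp "$stock_conf"
fi
```

If this prints an age, files under /var/tmp that have not been touched for that long are candidates for removal, which would fit the behaviour described above.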
Any idea what I should do now?
For example:
- Change SCRIPTS_REMOTE_DIR in oned.conf to /one-scripts
- Create /one-scripts on all cluster hosts
- chown oneadmin:oneadmin /one-scripts
- Restart OpenNebula
- Force a script sync from the frontend with "onehost sync"
As per your guide, I created the directory on the host:
mkdir /oned-remotescript
chown -R oneadmin:oneadmin /oned-remotescript
On the controller I changed:
SCRIPTS_REMOTE_DIR=/oned-remotescript
Then I restarted OpenNebula and synced the scripts:
systemctl restart opennebula
su - oneadmin
[oneadmin@controller ~]$ onehost sync
- Adding f1-cloud-02.f1soft.com to upgrade
[========================================] 1/1 f1-cloud-02.f1soft.com
All hosts updated successfully.
I am still getting the error below:
Sat Jul 30 18:35:32 2016 [Z0][ReM][D]: Req:3632 UID:0 VirtualMachinePoolInfo result SUCCESS, "<VM_POOL>135…"
Sat Jul 30 18:35:40 2016 [Z0][InM][I]: Command execution fail: 'if [ -x "/oned-remotescript/im/run_probes" ]; then /oned-remotescript/im/run_probes kvm /var/lib/one//datastores 4124 20 4 f1-cloud-01.f1soft.com; else exit 42; fi'
Sat Jul 30 18:35:40 2016 [Z0][InM][I]: Warning: Permanently added 'f1-cloud-01.f1soft.com,10.13.222.217' (ECDSA) to the list of known hosts.
Sat Jul 30 18:35:40 2016 [Z0][InM][I]: /oned-remotescript/im/run_probes: line 34: 32709 Aborted ./$i $ARGUMENTS
Sat Jul 30 18:35:40 2016 [Z0][InM][E]: Error executing collectd-client.rb
Sat Jul 30 18:35:40 2016 [Z0][InM][I]: ExitCode: 134
Sat Jul 30 18:35:40 2016 [Z0][InM][I]: Command execution fail: 'if [ -x "/oned-remotescript/im/run_probes" ]; then /oned-remotescript/im/run_probes kvm /var/lib/one//datastores 4124 20 4 f1-cloud-01.f1soft.com; else exit 42; fi'
Sat Jul 30 18:35:40 2016 [Z0][InM][I]: Warning: Permanently added 'f1-cloud-01.f1soft.com,10.13.222.217' (ECDSA) to the list of known hosts.
Sat Jul 30 18:35:40 2016 [Z0][InM][I]: /oned-remotescript/im/run_probes: line 34: 32710 Aborted ./$i $ARGUMENTS
Sat Jul 30 18:35:40 2016 [Z0][InM][E]: Error executing collectd-client.rb
Sat Jul 30 18:35:40 2016 [Z0][InM][I]: ExitCode: 134
Sat Jul 30 18:35:49 2016 [Z0][AuM][D]: Message received: AUTHENTICATE SUCCESS 6 -
So there is a different problem here, not the one we had.
The host is CentOS:
[root@f1-cloud-01 ~]# cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)
[root@f1-cloud-01 ~]#
When I run it manually on the host:
[root@f1-cloud-01 ~]# /oned-remotescript/im/run_probes kvm /var/lib/one//datastores 4124 20 4
/oned-remotescript/im/run_probes: line 34: 4942 Aborted ./$i $ARGUMENTS
ERROR MESSAGE --8<------
Error executing collectd-client.rb
ERROR MESSAGE ------>8--
[root@f1-cloud-01 ~]#
Please show the output of:
ls -l /oned-remotescript/im/kvm.d
If I understand the code correctly, the string "Error executing collectd-client.rb" can appear only if collectd-client.rb has the executable bit set. In a default installation it does not.
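A quick way to see which files in the probe directory would be treated as runnable is to list everything with an execute bit. This is only a sketch under the assumption stated above, namely that run_probes picks up files by their executable bit; list_executable_probes is a made-up helper name.

```shell
# List regular files in a directory that the current user may execute.
# list_executable_probes is a hypothetical helper for illustration only.
list_executable_probes() {
    dir=$1
    for f in "$dir"/*; do
        # -f skips directories and an unmatched glob, -x tests the execute bit
        if [ -f "$f" ] && [ -x "$f" ]; then
            basename "$f"
        fi
    done
}

# Example call against the directory used in this thread:
# list_executable_probes /oned-remotescript/im/kvm.d
```

If collectd-client.rb shows up in that list, it would be run directly as a probe, which is what the default permissions are meant to prevent.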
[root@f1-cloud-01 ~]# ls -l /oned-remotescript/im/kvm.d
total 8
-rwxr-xr-x 1 oneadmin oneadmin 2901 Nov 26 2015 collectd-client_control.sh
-rwxr-xr-x 1 oneadmin oneadmin 4066 Nov 26 2015 collectd-client.rb
[root@f1-cloud-01 ~]#
When I look at the system log on the host:
tail -f /var/log/messages
Jul 30 22:50:26 f1-cloud-01 dbus[30475]: [system] Failed to activate service 'org.freedesktop.login1': timed out
Jul 30 22:50:26 f1-cloud-01 dbus-daemon: dbus[30475]: [system] Failed to activate service 'org.freedesktop.login1': timed out
Jul 30 22:51:27 f1-cloud-01 dbus[30475]: [system] Activating via systemd: service name='org.freedesktop.login1' unit='dbus-org.freedesktop.login1.service'
Jul 30 22:51:27 f1-cloud-01 dbus-daemon: dbus[30475]: [system] Activating via systemd: service name='org.freedesktop.login1' unit='dbus-org.freedesktop.login1.service'
Jul 30 22:51:27 f1-cloud-01 systemd: Started Login Service.
Jul 30 22:51:52 f1-cloud-01 dbus[30475]: [system] Failed to activate service 'org.freedesktop.login1': timed out
Jul 30 22:51:52 f1-cloud-01 dbus-daemon: dbus[30475]: [system] Failed to activate service 'org.freedesktop.login1': timed out
Jul 30 22:52:53 f1-cloud-01 dbus[30475]: [system] Activating via systemd: service name='org.freedesktop.login1' unit='dbus-org.freedesktop.login1.service'
Jul 30 22:52:53 f1-cloud-01 dbus-daemon: dbus[30475]: [system] Activating via systemd: service name='org.freedesktop.login1' unit='dbus-org.freedesktop.login1.service'
Jul 30 22:52:53 f1-cloud-01 systemd: Started Login Service.
You changed the default permissions somehow. Remove the x flag from collectd-client.rb:
- On the frontend: "chmod a-x /var/lib/one/remotes/im/kvm.d/collectd-client.rb"
- Resync the scripts with the hosts
- Check that "ls -l /oned-remotescript/im/kvm.d" shows no x flag on collectd-client.rb
- Try to run "/oned-remotescript/im/run_probes kvm /var/lib/one//datastores 4124 20 4"
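The first step of that list can be wrapped so that it fails loudly if the bit is still set afterwards. This is a minimal sketch; strip_exec_bit is a made-up helper name, and the chmod is the same one given in the list.

```shell
# Remove every execute bit from a file and confirm that it is gone.
# strip_exec_bit is a hypothetical helper for illustration only.
strip_exec_bit() {
    file=$1
    chmod a-x "$file" || return 1
    if [ -x "$file" ]; then
        echo "still executable: $file" >&2
        return 1
    fi
    echo "not executable: $file"
}

# On the frontend, as in the steps above, followed by a resync:
# strip_exec_bit /var/lib/one/remotes/im/kvm.d/collectd-client.rb
# onehost sync
```

The check on the host side is still the "ls -l" from the list, since the helper only looks at the frontend copy.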
Yes!!!
Thank you, Denis. After fixing the permissions of collectd-client.rb it is working now.
Sun Jul 31 16:12:27 2016 [Z0][InM][D]: Monitoring host f1-cloud-01.f1soft.com (4)
Sun Jul 31 16:12:27 2016 [Z0][InM][D]: Monitoring host f1-cloud-02.f1soft.com (7)
Sun Jul 31 16:12:32 2016 [Z0][ImM][D]: Datastore system (0) successfully monitored.
Sun Jul 31 16:12:32 2016 [Z0][InM][D]: Host f1-cloud-02.f1soft.com (7) successfully monitored.
Sun Jul 31 16:12:32 2016 [Z0][VMM][D]: VM 139 successfully monitored: STATE=a CPU=1.5 MEMORY=8408124 NETRX=1845044779 NETTX=7338000951 DISK_SIZE=[ ID=0, SIZE=16709 ] DISK_SIZE=[ ID=1, SIZE=0 ]
Sun Jul 31 16:12:32 2016 [Z0][VMM][D]: VM 147 successfully monitored: STATE=a CPU=2.5 MEMORY=3179632 NETRX=1278616278 NETTX=377023787 DISK_SIZE=[ ID=0, SIZE=6106 ] DISK_SIZE=[ ID=1, SIZE=0 ]
Sun Jul 31 16:12:32 2016 [Z0][VMM][D]: VM 208 successfully monitored: STATE=a CPU=8.5 MEMORY=2097152 NETRX=22382660474 NETTX=25079890027 DISK_SIZE=[ ID=0, SIZE=2785 ] DISK_SIZE=[ ID=1, SIZE=0 ]
Sun Jul 31 16:12:32 2016 [Z0][VMM][D]: VM 211 successfully monitored: STATE=a CPU=6.5 MEMORY=6326240 NETRX=19633944334 NETTX=458568053 DISK_SIZE=[ ID=0, SIZE=101432 ] DISK_SIZE=[ ID=1, SIZE=0 ]
Sun Jul 31 16:12:32 2016 [Z0][VMM][D]: VM 219 successfully monitored: STATE=a CPU=4.5 MEMORY=4194304 NETRX=1338193257 NETTX=586703166 DISK_SIZE=[ ID=0, SIZE=17328 ] DISK_SIZE=[ ID=1, SIZE=0 ]
Sun Jul 31 16:12:32 2016 [Z0][VMM][D]: VM 309 successfully monitored: STATE=a CPU=0.5 MEMORY=2097152 NETRX=1061387246 NETTX=1297683539 DISK_SIZE=[ ID=0, SIZE=15467 ] DISK_SIZE=[ ID=1, SIZE=0 ]
Sun Jul 31 16:12:40 2016 [Z0][ReM][D]: Req:7696 UID:0 VirtualMachinePoolInfo invoked , -2, -1, -1, -1
Sun Jul 31 16:12:40 2016 [Z0][ReM][D]: Req:7696 UID:0 VirtualMachinePoolInfo result SUCCESS, "<VM_POOL>135…"
Sun Jul 31 16:12:40 2016 [Z0][ReM][D]: Req:9088 UID:0 VirtualMachinePoolInfo invoked , -2, -1, -1, -1
Sun Jul 31 16:12:40 2016 [Z0][ReM][D]: Req:9088 UID:0 VirtualMachinePoolInfo result SUCCESS, "<VM_POOL>135…"