Timeout executing 'if [ -x "/var/tmp/one/im/run_probes" ]

superxor · July 18, 2019, 6:55am

v 5.8

I just set up 20 new hosts, and one of them shows the following error right after being added:
Thu Jul 18 08:52:13 2019 : Error monitoring Host 10.0.3.58 (53): Timeout executing 'if [ -x "/var/tmp/one/im/run_probes" ]; then /var/tmp/one/im/run_probes kvm /var/lib/one//datastores 4124 60 53 10.0.3.58; else exit 42; fi'

I am not sure why. SSH works fine in both directions, and everything else is the same as on the other hosts. I tried re-enabling the host, deleting /var/tmp/one, and copying /var/tmp/one over with scp…

Any ideas?

ahuertas · July 18, 2019, 7:50am

Hello @superxor

Do those hosts have running VMs in libvirt?

superxor · July 18, 2019, 8:15am

No, these are brand-new OS installs (Ubuntu 18.04).
It only happened on one node, which is curious.

ahuertas · July 18, 2019, 9:57am

Could you please try executing the command manually on the host and see what error you get?
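For anyone following along, here is a minimal sketch of one way to run a command by hand under a time limit, so you can tell a hang apart from an ordinary failure. The helper name run_with_limit is made up for this example, and it assumes the coreutils timeout command is installed on the host.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenNebula): run a command under a
# time limit and report on stderr how it ended.
# Usage: run_with_limit SECONDS COMMAND [ARGS...]
run_with_limit() {
    limit=$1
    shift
    start=$(date +%s)
    timeout "$limit" "$@"
    status=$?
    end=$(date +%s)
    # timeout exits with status 124 when the limit was reached.
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s" >&2
    else
        echo "exit status $status after $((end - start))s" >&2
    fi
    return "$status"
}
```

On the affected host this could be called as run_with_limit 60 /var/tmp/one/im/run_probes kvm /var/lib/one//datastores 4124 60 53 10.0.3.58, using the arguments from the error message. The 60-second limit is an arbitrary choice for the sketch.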

superxor · July 18, 2019, 10:05am

Edit: if I run it manually, I get the whole node info:

ARCH=x86_64
MODELNAME="Intel(R) Xeon(R) W-2145 CPU @ 3.70GHz"
HYPERVISOR=kvm
TOTALCPU=1600
CPUSPEED=1200
TOTALMEMORY=131719688
USEDMEMORY=321904
FREEMEMORY=131397784
FREECPU=1600
USEDCPU=0
NETRX=35640648
NETTX=647040
KVM_MACHINES="pc-i440fx-bionic ubuntu isapc pc-1.1 pc-1.2 pc-1.3 pc-i440fx-zesty pc-i440fx-2.8 pc-1.0 pc-i440fx-2.9 pc-i440fx-2.6 pc-i440fx-2.7 xenfv pc-i440fx-wily pc-i440fx-2.3 pc-i440fx-2.4 pc-i440fx-2.5 pc-i440fx-yakkety pc-i440fx-2.1 pc-i440fx-2.2 pc-i440fx-2.0 pc-q35-yakkety pc-i440fx-bionic-hpb pc-q35-2.11 q35 pc-i440fx-xenial xenpv pc-q35-2.10 pc-q35-bionic-hpb pc-q35-xenial pc-i440fx-artful pc-i440fx-1.7 pc-q35-2.9 pc-0.15 pc-i440fx-1.5 pc-q35-2.7 pc-i440fx-1.6 pc-i440fx-2.11 pc pc-q35-2.8 pc-q35-zesty pc-0.13 pc-q35-artful pc-0.14 pc-q35-2.4 pc-i440fx-trusty pc-q35-2.5 pc-q35-2.6 pc-i440fx-1.4 pc-i440fx-2.10 pc-0.11 pc-0.12 pc-q35-bionic pc-0.10"
KVM_CPU_MODELS="486 pentium pentium2 pentium3 pentiumpro coreduo n270 core2duo qemu32 kvm32 cpu64-rhel5 cpu64-rhel6 kvm64 qemu64 Conroe Penryn Nehalem Nehalem-IBRS Westmere Westmere-IBRS SandyBridge SandyBridge-IBRS IvyBridge IvyBridge-IBRS Haswell-noTSX Haswell-noTSX-IBRS Haswell Haswell-IBRS Broadwell-noTSX Broadwell-noTSX-IBRS Broadwell Broadwell-IBRS Skylake-Client Skylake-Client-IBRS Skylake-Server Skylake-Server-IBRS athlon phenom Opteron_G1 Opteron_G2 Opteron_G3 Opteron_G4 Opteron_G5 EPYC EPYC-IBPB"
DS_LOCATION_USED_MB=2444
DS_LOCATION_TOTAL_MB=896131
DS_LOCATION_FREE_MB=848097
HOSTNAME=xxx
VM_POLL=YES
VERSION="5.8.1"

But the host still shows ERROR in Nodes.

ahuertas · July 18, 2019, 11:10am

Try setting the host offline and then enabling it again.

superxor · July 18, 2019, 11:18am

Tried that; it still errors out.

ahuertas · July 18, 2019, 1:29pm

Could you please send me the output of onehost show <host_id> -x?

superxor · July 18, 2019, 2:00pm

<HOST>
  <ID>53</ID>
  <NAME>10.0.3.58</NAME>
  <STATE>7</STATE>
  <IM_MAD><![CDATA[kvm]]></IM_MAD>
  <VM_MAD><![CDATA[kvm]]></VM_MAD>
  <LAST_MON_TIME>1563458327</LAST_MON_TIME>
  <CLUSTER_ID>0</CLUSTER_ID>
  <CLUSTER>default</CLUSTER>
  <HOST_SHARE>
    <DISK_USAGE>0</DISK_USAGE>
    <MEM_USAGE>0</MEM_USAGE>
    <CPU_USAGE>0</CPU_USAGE>
    <TOTAL_MEM>0</TOTAL_MEM>
    <TOTAL_CPU>0</TOTAL_CPU>
    <MAX_DISK>0</MAX_DISK>
    <MAX_MEM>0</MAX_MEM>
    <MAX_CPU>0</MAX_CPU>
    <FREE_DISK>0</FREE_DISK>
    <FREE_MEM>0</FREE_MEM>
    <FREE_CPU>0</FREE_CPU>
    <USED_DISK>0</USED_DISK>
    <USED_MEM>0</USED_MEM>
    <USED_CPU>0</USED_CPU>
    <RUNNING_VMS>0</RUNNING_VMS>
    <DATASTORES/>
    <PCI_DEVICES/>
  </HOST_SHARE>
  <VMS/>
  <TEMPLATE>
    <CLUSTER_ID><![CDATA[0]]></CLUSTER_ID>
    <ERROR><![CDATA[Thu Jul 18 15:55:33 2019 : Error monitoring Host 10.0.3.58 (53): Timeout executing 'if [ -x "/var/tmp/one/im/run_probes" ]; then /var/tmp/one/im/run_probes kvm /var/lib/one//datastores 4124 60 53 10.0.3.58; else                              exit 42; fi']]></ERROR>
    <IM_MAD><![CDATA[kvm]]></IM_MAD>
    <NAME><![CDATA[10.0.3.58]]></NAME>
    <RESERVED_CPU><![CDATA[]]></RESERVED_CPU>
    <RESERVED_MEM><![CDATA[]]></RESERVED_MEM>
    <VM_MAD><![CDATA[kvm]]></VM_MAD>
  </TEMPLATE>
</HOST>

ahuertas · July 19, 2019, 7:39am

Does ssh 10.0.3.58 work without a password from the frontend?

superxor · July 19, 2019, 1:29pm

Yes, that was the first thing I tested.

ahuertas · July 22, 2019, 7:48am

Could you try executing the polling command from the frontend instead, with ssh 10.0.3.58 '<COMMAND>'?
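A rough sketch of how the remote check could be scripted from the frontend, in case it helps. In OpenSSH the remote command is passed as the trailing argument (the -e option sets the escape character). BatchMode=yes makes ssh fail instead of waiting at a password prompt, and ConnectTimeout bounds the connection attempt. The function name remote_probe and the SSH_BIN override are assumptions for this example, and the time limit relies on the coreutils timeout command.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenNebula): run one command on a host
# over non-interactive SSH, under an overall time limit.
# Usage: remote_probe HOST SECONDS 'COMMAND'
# SSH_BIN is an assumed override so a different ssh binary can be swapped in.
remote_probe() {
    host=$1
    limit=$2
    command=$3
    timeout "$limit" "${SSH_BIN:-ssh}" \
        -o BatchMode=yes \
        -o ConnectTimeout=10 \
        "$host" "$command"
}
```

Run as oneadmin on the frontend, for example remote_probe 10.0.3.58 60 '<COMMAND>', with <COMMAND> being the quoted string from the error message. An exit status of 124 would mean the time limit was hit, and 255 is what ssh itself returns on a connection or authentication error.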

superxor · July 22, 2019, 11:17am

The output is empty, and the host still errors out.

ahuertas · July 22, 2019, 11:21am

When you execute the command, you don’t get the same output as executing it on the host?

superxor · July 22, 2019, 11:27am

I checked again, and I do get it. It's the same output as printed above.

ahuertas · July 22, 2019, 12:35pm

What value do you have in oned.conf for MONITORING_INTERVAL_HOST?
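A small sketch for pulling that value out of the file, assuming the usual /etc/one/oned.conf location and a plain single-line KEY = value entry. The helper name conf_value is made up for this example, and it does not handle multi-line or bracketed attributes.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenNebula): print the value of a simple
# "KEY = value" line from a config file, skipping commented-out lines.
# Usage: conf_value KEY FILE
conf_value() {
    key=$1
    file=$2
    awk -F= -v key="$key" '
        /^[ \t]*#/ { next }                  # ignore comment lines
        {
            name = $1
            gsub(/[ \t]/, "", name)          # strip padding around the key
            if (name == key) {
                value = $2
                gsub(/^[ \t]+|[ \t]+$/, "", value)
                gsub(/"/, "", value)         # drop surrounding quotes
                print value
                exit
            }
        }
    ' "$file"
}
```

For example, conf_value MONITORING_INTERVAL_HOST /etc/one/oned.conf prints the configured interval, or nothing if the line is missing or commented out.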

superxor · July 23, 2019, 6:00am

The default value; I didn't touch anything there.

ahuertas · July 23, 2019, 7:49am

Try onehost sync --force, and then set that host offline and enable it again.
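Put together, the sequence could look like the sketch below. The function name resync_host and the ONEHOST_BIN override are assumptions for this example; the three onehost subcommands are the ones mentioned in this thread, and the function stops at the first step that fails.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenNebula): push the probes again,
# then cycle one host through offline and enable.
# Usage: resync_host HOST_ID
# ONEHOST_BIN is an assumed override for the onehost command.
resync_host() {
    id=$1
    onehost=${ONEHOST_BIN:-onehost}
    "$onehost" sync --force &&
        "$onehost" offline "$id" &&
        "$onehost" enable "$id"
}
```

For the host in this thread that would be resync_host 53, run as oneadmin on the frontend. Afterwards, onehost show 53 should reveal whether the ERROR attribute has cleared once the next monitoring cycle has run.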

superxor · July 23, 2019, 8:28am

Either it got deleted or I posted it in the wrong thread, but I fixed it by just reinstalling the package. For some magical reason it worked, even though I had tried this before.

ahuertas · July 23, 2019, 8:40am

Perfect then!
