Sunstone - unable to show (physical) host information

bart · March 17, 2015, 11:56am

Hi,

We’re experiencing something werid (again…) with the OpenNebula Sunstone.

First, let me show you the commandline equivilant of what I’m doing (just to make sure it’s all clear):

As a first step welist all hosts:

$ onehost list
  ID NAME            CLUSTER   RVM      ALLOCATED_CPU      ALLOCATED_MEM STAT
   9 sf01.**** Cluster     0       0 / 800 (0%)    0K / 31.3G (0%) on
  11 sf02.**** Cluster     0       0 / 800 (0%)    0K / 31.3G (0%) on
  12 sf03.**** Cluster     0       0 / 800 (0%)    0K / 31.3G (0%) on
  13 sf04.**** Cluster     0       0 / 800 (0%)    0K / 31.3G (0%) on

Then we show the information regarding one specific host:

$ onehost show 9
HOST 9 INFORMATION
ID                    : 9
NAME                  : sf01.****
CLUSTER               : Cluster
STATE                 : MONITORED
IM_MAD                : kvm
VM_MAD                : kvm
VN_MAD                : dummy
LAST MONITORING TIME  : 03/17 12:44:18

(and so on)

Up to this point it all works nicely.

However, when we use the Sunstone interface the following happens:

The host listing is shown, instantly + update status and all.
When we now click a host to show the specific information we get the loading page and at that point nothing happens… It’s just stuck.

Looking at the logging we can see that the Get request comes through to the Sunstone. However we don’t get any erorrs. Monitoring the ond.log also doesn’t give anything to work with.

The funny part however is that when we first add a host (using sunstone), and immediatly disable that host, then we can view the host info (if we’re quick). Hosts that have been monitored can’t be viewed (even after disabling).

At first we thought it was our firewall. But all opennebula components are on the same machine + if we’re quick we can get some information (eventually). But after a host is properly added (thus, status on) it will simply fail to display in the sunstone.

Does anyone know where I should start looking? Debug logging is enabled for sunstone and oned but we don’t get any information regarding errors (seems like the requests disappears in /dev/null or something…).

Any insight on this matter would be greatly appreciated!!!

Thanks in advance,

Bart

bart · March 17, 2015, 12:03pm

Maybe the full hos tinformation is important, below you’ll find the host along with all it’s attributes.

$ onehost show 9
HOST 9 INFORMATION
ID                    : 9
NAME                  : sf01.****
CLUSTER               : Cluster
STATE                 : MONITORED
IM_MAD                : kvm
VM_MAD                : kvm
VN_MAD                : dummy
LAST MONITORING TIME  : 03/17 12:44:18

HOST SHARES
TOTAL MEM             : 31.3G
USED MEM (REAL)       : 923.4M
USED MEM (ALLOCATED)  : 0K
TOTAL CPU             : 800
USED CPU (REAL)       : 12
USED CPU (ALLOCATED)  : 0
RUNNING VMS           : 0

MONITORING INFORMATION
ARCH="x86_64"
ARCH="x86_64"
ARCH="x86_64"
ARCH="x86_64"
CPUSPEED="2003"
CPUSPEED="2003"
CPUSPEED="2003"
CPUSPEED="2003"
HOSTNAME="sf01.****"
HOSTNAME="sf01.****"
HOSTNAME="sf01.****"
HOSTNAME="sf01.****"
HYPERVISOR="kvm"
HYPERVISOR="kvm"
HYPERVISOR="kvm"
HYPERVISOR="kvm"
MODELNAME="Intel(R) Xeon(R) CPU           X5355  @ 2.66GHz"
MODELNAME="Intel(R) Xeon(R) CPU           X5355  @ 2.66GHz"
MODELNAME="Intel(R) Xeon(R) CPU           X5355  @ 2.66GHz"
MODELNAME="Intel(R) Xeon(R) CPU           X5355  @ 2.66GHz"
NETRX="5079012522"
NETRX="5079041539"
NETRX="5079148700"
NETRX="5079166204"
NETTX="5367373336"
NETTX="5367414144"
NETTX="5367561384"
NETTX="5367583010"
RESERVED_CPU=""
RESERVED_MEM=""
VERSION="4.12.0"
VERSION="4.12.0"
VERSION="4.12.0"
VERSION="4.12.0"

VIRTUAL MACHINES

    ID USER     GROUP    NAME            STAT UCPU    UMEM HOST             TIME

dmolina · March 17, 2015, 12:03pm

Could you check if there is any error in the browser developer console?

bart · March 17, 2015, 12:22pm

Good tip! This does indeed give me some errors.

The following error is displayed:

TypeError: host_info.TEMPLATE.HYPERVISOR.toLowerCase is not a function hosts-tab.js:781:41

Which corresponds with the following code (the error console brings me to the bold line):

// Get rid of the unwanted (for show) HOST keys
var stripped_host_template = {};
var unshown_values         = {};

**if (host_info.TEMPLATE.HYPERVISOR && host_info.TEMPLATE.HYPERVISOR.toLowerCase() != "vcenter")**
{
  stripped_host_template = host_info.TEMPLATE;
}
else
{
  for (key in host_info.TEMPLATE)
      if(!key.match(/HOST/))
          stripped_host_template[key]=host_info.TEMPLATE[key];
      else
          unshown_values[key]=host_info.TEMPLATE[key];
}

I’m lost at this point, does this mean I’m missing something in my installation/configuration?

dmolina · March 17, 2015, 1:47pm

I think the problem is with the repeated keys in the monitoring information. Could you check if there are no duplicated probes in var/lib/one/remotes/im/kvm-probes.d (front end) or /var/tmp/one/im/kvm-probes.d (nodes). There should be only a key=value per attribute in the host monitoring information.

bart · March 17, 2015, 2:20pm

I’ve checked that directory and it turned out there were a bunch of .rpmsave files.

1.5K -rwxr-xr-x. 1 oneadmin oneadmin 1.2K Mar 10 00:43 architecture.sh
1.5K -rwxr-xr-x. 1 oneadmin oneadmin 1.2K Jan 15 17:26 architecture.sh.rpmsave
1.5K -rwxr-xr-x. 1 oneadmin oneadmin 1.4K Mar 10 00:43 collectd-client-shepherd.sh
1.5K -rwxr-xr-x. 1 oneadmin oneadmin 1.4K Jan 15 17:26 collectd-client-shepherd.sh.rpmsave
1.5K -rwxr-xr-x. 1 oneadmin oneadmin 1.4K Mar 10 00:43 cpu.sh
1.5K -rwxr-xr-x. 1 oneadmin oneadmin 1.4K Jan 15 17:26 cpu.sh.rpmsave
3.5K -rwxr-xr-x. 1 oneadmin oneadmin 3.2K Mar 10 00:43 kvm.rb
3.5K -rwxr-xr-x. 1 oneadmin oneadmin 3.2K Jan 15 17:26 kvm.rb.rpmsave
2.5K -rwxr-xr-x. 1 oneadmin oneadmin 2.2K Mar 10 00:43 monitor_ds.sh
2.5K -rwxr-xr-x. 1 oneadmin oneadmin 2.2K Jan 15 17:26 monitor_ds.sh.rpmsave
1.5K -rwxr-xr-x. 1 oneadmin oneadmin 1.2K Mar 10 00:43 name.sh
1.5K -rwxr-xr-x. 1 oneadmin oneadmin 1.2K Jan 15 17:26 name.sh.rpmsave
1.5K -rwxr-xr-x. 1 oneadmin oneadmin 1.2K Mar 10 00:43 poll.sh
1.5K -rwxr-xr-x. 1 oneadmin oneadmin 1.2K Jan 15 17:26 poll.sh.rpmsave
1.5K -rwxr-xr-x. 1 oneadmin oneadmin 1.3K Mar 10 00:43 version.sh
1.5K -rwxr-xr-x. 1 oneadmin oneadmin 1.3K Jan 15 17:26 version.sh.rpmsave

So I’ve searched for those and removed all of them from /var/lib/one:

cd /var/lib/one
find . -name *.rpmsave | xargs rm

The frontend and KVM clients should be the same since they have the same /var/lib/one (stored on a glusterfs volume).

After that I restarted OpenNebula and did the test again.

But sadly still no luck.

I also tested removing a host and then adding it again to see if it would registrer nicely, but that didn’t seem to make any difference. The error console still gives the same error.

dmolina · March 17, 2015, 2:25pm

Could you send us the output of one of the hosts that is giving you this error:

onehost show <HOST_ID> -x

bart · March 17, 2015, 2:27pm

Here’s the output:

<HOST>
  <ID>16</ID>
  <NAME>sf02</NAME>
  <STATE>2</STATE>
  <IM_MAD><![CDATA[kvm]]></IM_MAD>
  <VM_MAD><![CDATA[kvm]]></VM_MAD>
  <VN_MAD><![CDATA[dummy]]></VN_MAD>
  <LAST_MON_TIME>1426602293</LAST_MON_TIME>
  <CLUSTER_ID>100</CLUSTER_ID>
  <CLUSTER>Cluster</CLUSTER>
  <HOST_SHARE>
    <DISK_USAGE>0</DISK_USAGE>
    <MEM_USAGE>0</MEM_USAGE>
    <CPU_USAGE>0</CPU_USAGE>
    <MAX_DISK>13726726</MAX_DISK>
    <MAX_MEM>32778788</MAX_MEM>
    <MAX_CPU>800</MAX_CPU>
    <FREE_DISK>13722731</FREE_DISK>
    <FREE_MEM>32184836</FREE_MEM>
    <FREE_CPU>796</FREE_CPU>
    <USED_DISK>3996</USED_DISK>
    <USED_MEM>593952</USED_MEM>
    <USED_CPU>3</USED_CPU>
    <RUNNING_VMS>0</RUNNING_VMS>
    <DATASTORES/>
  </HOST_SHARE>
  <VMS/>
  <TEMPLATE>
    <ARCH><![CDATA[x86_64]]></ARCH>
    <ARCH><![CDATA[x86_64]]></ARCH>
    <ARCH><![CDATA[x86_64]]></ARCH>
    <ARCH><![CDATA[x86_64]]></ARCH>
    <CPUSPEED><![CDATA[2003]]></CPUSPEED>
    <CPUSPEED><![CDATA[2003]]></CPUSPEED>
    <CPUSPEED><![CDATA[2003]]></CPUSPEED>
    <CPUSPEED><![CDATA[2003]]></CPUSPEED>
    <HOSTNAME><![CDATA[sf02]]></HOSTNAME>
    <HOSTNAME><![CDATA[sf02]]></HOSTNAME>
    <HOSTNAME><![CDATA[sf02]]></HOSTNAME>
    <HOSTNAME><![CDATA[sf02]]></HOSTNAME>
    <HYPERVISOR><![CDATA[kvm]]></HYPERVISOR>
    <HYPERVISOR><![CDATA[kvm]]></HYPERVISOR>
    <HYPERVISOR><![CDATA[kvm]]></HYPERVISOR>
    <HYPERVISOR><![CDATA[kvm]]></HYPERVISOR>
    <MODELNAME><![CDATA[Intel(R) Xeon(R) CPU           X5355  @ 2.66GHz]]></MODELNAME>
    <MODELNAME><![CDATA[Intel(R) Xeon(R) CPU           X5355  @ 2.66GHz]]></MODELNAME>
    <MODELNAME><![CDATA[Intel(R) Xeon(R) CPU           X5355  @ 2.66GHz]]></MODELNAME>
    <MODELNAME><![CDATA[Intel(R) Xeon(R) CPU           X5355  @ 2.66GHz]]></MODELNAME>
    <NETRX><![CDATA[1942119537]]></NETRX>
    <NETRX><![CDATA[1942137042]]></NETRX>
    <NETRX><![CDATA[1942201784]]></NETRX>
    <NETRX><![CDATA[1942220127]]></NETRX>
    <NETTX><![CDATA[1594807268]]></NETTX>
    <NETTX><![CDATA[1594829104]]></NETTX>
    <NETTX><![CDATA[1594916794]]></NETTX>
    <NETTX><![CDATA[1594938804]]></NETTX>
    <RESERVED_CPU><![CDATA[]]></RESERVED_CPU>
    <RESERVED_MEM><![CDATA[]]></RESERVED_MEM>
    <VERSION><![CDATA[4.12.0]]></VERSION>
    <VERSION><![CDATA[4.12.0]]></VERSION>
    <VERSION><![CDATA[4.12.0]]></VERSION>
    <VERSION><![CDATA[4.12.0]]></VERSION>
  </TEMPLATE>
</HOST>

dmolina · March 17, 2015, 2:33pm

Could you try to:

Remove the *.rpmsave files from /var/lib/one/remotes/im/kvm-probes.d
Remove the content of the /var/tmp/one directory in the nodes
Execute the onehost sync --force from the frontend
Recreate the host

bart · March 17, 2015, 2:48pm

Already did remove the rpmsave files.

I then cleared all /var/tmp/one directories.
Then forced a sync, but this gave an error on a few hosts. Those that didn’t give an error seemed to work afterwards.
So as a last step I removed all hosts and added them again, after this step everything was working.

So I guess the main problem was the /var/tmp/one directory?

Anyways, thanks allot for the quick help!!!

Topic		Replies	Views
[SOLVED] Sunstone is loading to infinity Installation & Configuration solved	16	1387	December 18, 2019
VMs not visible in Sunstone after upgrade to 5.0.2 Product Support	4	982	July 22, 2016
Host status is ERROR Installation & Configuration	1	505	January 7, 2019
Sunstone - Time out errors Product Support	5	2210	November 26, 2016
[SOLVED] Error monitoring Host (2): Error executing probes Product Support	9	10754	March 30, 2019

Sunstone - unable to show (physical) host information

Related topics