Monitoring ONe with Nagios?

Yenya · September 29, 2016, 7:25am

Hello,

Do you monitor the status of OpenNebula with a monitoring tool like Nagios? If so, which parts of OpenNebula's state do you watch, and with which tool?

I would like to be informed about general problems which require sysadmin attention (hosts in Error state, images in Error state, scheduling problems due to resource exhaustion, etc.). I would like to have a single three-state value “ONe is OK/WARNING/CRITICAL”.

My language of choice is Perl, and I have discovered that the Net::OpenNebula CPAN module works for me, so I can write something myself. But I would like to look at existing solutions first.

Thanks,

-Yenya

darkfader · October 1, 2016, 8:03pm

Hi,

I made some checks a few years back, and there are also a few more that already existed:

My checks are primarily for Check_MK; I haven’t used plain Nagios for a very long time.

The code should be very easy to read for you anyway.

The other, API-based check is equally important. It seems to have been renamed, so I have no current info about it.

I found a copy of the original repo here:

darkfader · October 1, 2016, 8:13pm

One thing I didn’t answer (sorry) was the single tri-state thing.
I would advise against that, considering you have literally hundreds of components behind a ONE cluster.

It’s possible if you use one of the various business intelligence addons for Nagios, but even those require you to have all the single checks and then build one large global green/yellow/red status out of them.

If you make your own check to have such a summary status, make sure you have a strong separation between WARN (look at it, things might still be working though) and CRIT (things are dead, drop your drink NOW).
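To illustrate the summary idea, here is a minimal Python sketch, not taken from any existing check. It assumes you already have individual check results as (name, state) pairs; the names `State` and `summarize` are made up for this example. The overall state is simply the worst individual state, and the non-OK items are listed so the summary still tells you where to look.

```python
from enum import IntEnum


class State(IntEnum):
    """Three-state result; higher value means worse."""
    OK = 0
    WARNING = 1
    CRITICAL = 2


def summarize(results):
    """Combine individual results into one overall state.

    results: list of (check_name, State) pairs (hypothetical input format).
    Returns (overall_state, list of "name=STATE" strings for non-OK checks).
    """
    # The worst individual state wins; an empty list counts as OK.
    overall = max((state for _, state in results), default=State.OK)
    problems = [f"{name}={state.name}"
                for name, state in results if state != State.OK]
    return overall, problems
```

Whether a given condition counts as WARN or CRIT has to be decided in the individual checks; the summary only propagates that decision.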

paulmarc · November 24, 2016, 4:04pm

Hi @darkfader,

Thank you for the great check_mk scripts.
I just need some pointers on how to install them.

Any tips?

darkfader · December 11, 2016, 9:11pm

Hi,

case1:
You already run Check_MK or can use it.
-> Look at the Check_MK documentation about plugins.
Generally, install things from the plugins or local folders to /usr/lib/check_mk_agent/plugins or /usr/lib/check_mk_agent/local on the monitored systems, respectively.
(Some checks are for the frontend, some for the nodes.)
The things in the checks folder need to go to local/share/check_mk/checks on your Check_MK server.
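For orientation, a script in the agent's local directory only has to print one line per service: a status number, a service name, a metrics field (or "-"), and free text. The following Python sketch builds such a line. The helper name `local_check_line` and the service names are invented for illustration and are not part of the checks mentioned above.

```python
def local_check_line(status, service, text, metrics=None):
    """Build one output line for a Check_MK local check.

    status:  0 = OK, 1 = WARN, 2 = CRIT, 3 = UNKNOWN
    service: service name; kept free of spaces here for simplicity
    metrics: optional dict of metric name -> value
    """
    if status not in (0, 1, 2, 3):
        raise ValueError(f"invalid status: {status}")
    if " " in service:
        raise ValueError("service name must not contain spaces")
    # Multiple metrics are joined with "|"; "-" means no metrics.
    if metrics:
        metric_field = "|".join(f"{k}={v}" for k, v in metrics.items())
    else:
        metric_field = "-"
    return f"{status} {service} {metric_field} {text}"
```

A script would then just `print(local_check_line(0, "ONE_hosts", "12 hosts OK"))` for each service it reports.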

case2:
You don’t run Check_MK and want to stick with Nagios.
You’ll need to modify the checks so that their output matches the Nagios plugin standard. It’s not very hard once you’ve done it a few times.

I would recommend case one: it takes maybe half a day to get the most important parts of monitoring in place, or about 3 days if you know very little about Nagios/Check_MK yet.
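As a rough guide to what that standard asks for: a Nagios plugin prints a single status line, optionally followed by performance data after a "|", and reports the state through its exit code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). The Python sketch below shows that shape; the function names and the example values are assumptions for illustration, not code from the existing checks.

```python
import sys

# Exit codes defined by the Nagios plugin conventions.
STATE_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def format_plugin_result(service, code, message, perfdata=None):
    """Return (status_line, exit_code) in Nagios plugin style.

    perfdata: optional dict of label -> value, appended after " | ".
    """
    if code not in STATE_NAMES:
        # Anything unexpected is reported as UNKNOWN.
        code, message = 3, f"invalid state code; original message: {message}"
    line = f"{service} {STATE_NAMES[code]} - {message}"
    if perfdata:
        line += " | " + " ".join(f"{k}={v}" for k, v in perfdata.items())
    return line, code


def exit_with(service, code, message, perfdata=None):
    """Print the status line and exit with the matching code."""
    line, exit_code = format_plugin_result(service, code, message, perfdata)
    print(line)
    sys.exit(exit_code)
```

A converted check would end with something like `exit_with("ONE", 1, "2 hosts in ERROR state", {"hosts_error": 2})`.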

Here’s some material I wrote for people who took my class on Check_MK (and OpenNebula) back in May:

http://confluence.wartungsfenster.de/display/Adminspace/OpenNebula+Monitoring+with+Check_MK
