Version 3 (modified by joshuadf, 7 years ago)

--

Nagios Quick Start Guide

On el6 this really is a quick start. Precompiled Nagios packages are available from EPEL and are set up to run after just a few simple changes. There is no need to read up on the usual compile, make, make install routine.

yum -y install nagios nagios-plugins-all

Then edit the email address in /etc/nagios/objects/contacts.cfg and start nagios. By default, nagios monitors only localhost.
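
As a rough sketch, the contact edit can be scripted. This assumes the stock EPEL contacts.cfg still carries the placeholder address nagios@localhost and that GNU sed is available. The helper name set_contact_email and the address admin@example.org are made up for illustration, and the service commands are left as comments because they need root on the Nagios server:

```shell
#!/bin/sh
# Hypothetical helper: replace the placeholder contact address.
# Assumption: the file still contains "nagios@localhost".
set_contact_email() {
    # $1 = new email address, $2 = path to contacts.cfg
    sed -i "s/nagios@localhost/$1/" "$2"
}

# Typical use on the Nagios server, as root (commented out here):
#   set_contact_email admin@example.org /etc/nagios/objects/contacts.cfg
#   nagios -v /etc/nagios/nagios.cfg   # check the configuration for errors
#   service nagios start               # el6 uses SysV init scripts
#   chkconfig nagios on                # start nagios at boot
```

Running nagios -v before every start or restart is a cheap way to catch typos in the object files.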

To monitor a remote host, here are the object definitions for ping and ssh checks. Put them in localhost.cfg, or create a new myhost.cfg file and edit /etc/nagios/nagios.cfg so that it loads myhost.cfg instead of localhost.cfg:

define host{
        use                     linux-server            ; Name of host template to use
        host_name               myhost.biostr.washington.edu
        alias                   myhost.biostr.washington.edu
        address                 123.45.67.89
        }

define hostgroup{
        hostgroup_name  linux-servers ; The name of the hostgroup
        alias           Linux Servers ; Long name of the group
        members         myhost.biostr.washington.edu     ; Comma separated list of hosts that belong to this group
        }

define service{
        use                             local-service         ; Name of service template to use
        host_name                       myhost.biostr.washington.edu
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        }

define service{
        use                             local-service         ; Name of service template to use
        host_name                       myhost.biostr.washington.edu
        service_description             SSH
        check_command                   check_ssh
        }
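
If the definitions go into a new file, nagios.cfg has to point at it. A minimal sketch of that swap, assuming the stock line cfg_file=/etc/nagios/objects/localhost.cfg is present and the new file sits in the same directory. The helper name use_host_cfg is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: make nagios.cfg load a different object file
# in place of localhost.cfg (same directory assumed).
use_host_cfg() {
    # $1 = new object file name, e.g. myhost.cfg
    # $2 = path to nagios.cfg
    sed -i "s|^cfg_file=\(.*\)/localhost\.cfg\$|cfg_file=\1/$1|" "$2"
}

# Typical use on the Nagios server, as root (commented out here):
#   use_host_cfg myhost.cfg /etc/nagios/nagios.cfg
#   nagios -v /etc/nagios/nagios.cfg && service nagios restart
```

Only the localhost.cfg line is rewritten, so the other cfg_file entries (templates, commands, contacts) stay in place.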

If you want to use the web interface at http://your-nagios-server.tld/nagios/ then you also need to create an htpasswd file. The default user is "nagiosadmin":

htpasswd -c /etc/nagios/passwd nagiosadmin

You also need to install php and restart Apache.

From there you can read the nagios docs for more details and more complex setups. In particular you might want to add or even write your own custom service checks, such as for VmwareEsxi.

Custom nagios service checks

A few supporting libraries I've installed for various purposes:

perl-Expect
perl-libwww-perl
perl-SOAP-Lite
pywbem
SOAPpy

The plugin developer guidelines are at http://nagiosplug.sourceforge.net/developer-guidelines.html

I tend to use the same template over and over. For example, here is a nagios plugin that monitors a Java RMI service. Instead of writing the whole thing in Java, I just fire off a jar from this Perl plugin template:

#! /usr/bin/perl -w
# http://nagiosplug.sourceforge.net/developer-guidelines.html#PLUGOUTPUT

use strict;
use Getopt::Long;
use vars qw($opt_V $opt_h $opt_w $verbose $opt_c $PROGNAME $REVISION $line $pattern);
use lib "." ;
use lib "/usr/lib/nagios/plugins/" ;
use utils qw(%ERRORS &print_revision &support &usage);
use Expect;

$PROGNAME = "check_";
$REVISION = '$ 1.1 $';

sub print_usage () { print "Usage: $PROGNAME \n"; }

sub print_help () {
        print_revision($PROGNAME, $REVISION);
        print " This plugin ";
        print_usage();
        #support();
}

$ENV{'PATH'}='/usr/bin/';
$ENV{'BASH_ENV'}=''; 
$ENV{'ENV'}='';

Getopt::Long::Configure('bundling');
GetOptions
        ("V"   => \$opt_V, "version"    => \$opt_V,
         "h"   => \$opt_h, "help"       => \$opt_h,
         "w=s" => \$opt_w, "warning=s"  => \$opt_w,
         "c=s" => \$opt_c, "critical=s" => \$opt_c,
         "v"   => \$verbose, "verbose"  => \$verbose);

if ($opt_V) {
        print_revision($PROGNAME, $REVISION);
        exit $ERRORS{'OK'};
}

if ($opt_h) {print_help(); exit $ERRORS{'OK'};}

# =============================================================================

# EDIT ME: below are the params for the Web Service Call


my $result_check = "all good";
my $result_message = "Good results";

my @hosts = ("uvula","femur","orbit");
my $result_string;
my $ok = 0;

foreach my $host (@hosts) {
        $result_string .= "Host " . $host . ": ";
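        # Report a failure to launch the jar as UNKNOWN instead of dying with an arbitrary exit code
        open(MINDSEER,"/usr/bin/java -jar /usr/lib/nagios/plugins/sig/CheckAlive.jar $host 2>&1 |")
                or do { print "Error: can't run CheckAlive.jar: $!\n"; exit $ERRORS{'UNKNOWN'}; };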
        while (<MINDSEER>) { $result_string .= $_; } 
        close(MINDSEER);
}

## poor man's check... one string full of messages from all hosts
$_ = $result_string;
# =============================================================================
  if (/java.rmi.NotBoundException/) {
    print "Error: MindSeer Visualizer not running!\n";
    print "Error:\n$_\n" if ( defined $verbose );
    exit $ERRORS{'CRITICAL'};
  }
  elsif (/java.net.NoRouteToHostException/) {
    print "Error: Host down or RMI not running or firewalled!\n";
    print "Error:\n$_\n" if ( defined $verbose );
    exit $ERRORS{'CRITICAL'};
  }
  elsif (/java.rmi.ConnectException/) {
    print "Error: Connection refused\n";
    print "Error:\n$_\n" if ( defined $verbose );
    exit $ERRORS{'CRITICAL'};
  }
  elsif (/java.rmi.UnknownHostException/) {
    print "Error: No such hostname!\n";
    print "Error:\n$_\n" if ( defined $verbose );
    exit $ERRORS{'CRITICAL'};
  }
  elsif (!defined $_) {
    print "Error: Empty results!\n";
    exit $ERRORS{'CRITICAL'};
  }
  elsif (/$result_check/) {  
    print "$result_message\n";
    print "Results:\n$_\n" if ( defined $verbose );
    exit $ERRORS{'OK'};
  }
  else {
    print "Unknown Error $_!\n";
    print "Results:\n$_\n" if ( defined $verbose );
    exit $ERRORS{'CRITICAL'};
  }
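
The %ERRORS hash from utils.pm hides the actual exit codes, which follow the plugin convention of 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN. As a sketch only, the same pattern-to-state mapping can be written as a shell function with the codes spelled out. The name classify_rmi_output is hypothetical, and the messages are shortened from the Perl version:

```shell
#!/bin/sh
# Hypothetical helper mirroring the Perl if/elsif chain above.
# Return codes: 0 = OK, 2 = CRITICAL (Nagios plugin convention).
classify_rmi_output() {
    # $1 = combined output of the check, $2 = string that signals success
    case "$1" in
        *java.rmi.NotBoundException*)      echo "Error: service not bound";      return 2 ;;
        *java.net.NoRouteToHostException*) echo "Error: host down or firewalled"; return 2 ;;
        *java.rmi.ConnectException*)       echo "Error: connection refused";     return 2 ;;
        *java.rmi.UnknownHostException*)   echo "Error: no such hostname";       return 2 ;;
        "")                                echo "Error: empty results";          return 2 ;;
        *"$2"*)                            echo "OK: $2";                        return 0 ;;
        *)                                 echo "Unknown error";                 return 2 ;;
    esac
}

# In a plugin script the result would become the exit status, e.g.:
#   output=$(/usr/bin/java -jar /usr/lib/nagios/plugins/sig/CheckAlive.jar "$host" 2>&1)
#   classify_rmi_output "$output" "all good"; exit $?
```

As in the Perl template, the error patterns are tested before the success string, so one failing host turns the whole check CRITICAL even if another host reports success.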
  

For a SOAP service, add use SOAP::Lite; at the top (or use SOAP::Lite +trace; for debugging) and replace the part below "EDIT ME" with some service-specific code:

# EDIT ME: below are the params for the Web Service Call

my $query = 'WHERE X->bar CREATE foo(bar)';
my $result_check = "Heart";
my $result_message = "Good results for NAME->Heart!";

eval {
$_ = SOAP::Lite
  -> service("http://fma.biostr.washington.edu:8080/OqafmaWebService/wsdl/OqafmaWS.wsdl")
  -> runQuery("$query");
};
# =============================================================================
if ($@) {
  print "Error: Web Service call failed!\n";
  print "Error:\n$@\n" if ( defined $verbose );
  exit $ERRORS{'CRITICAL'};
}
elsif (!defined $_) {
  print "Error: Empty results!\n";
  exit $ERRORS{'CRITICAL'};
}
elsif (/$result_check/) {  
  print "$result_message\n";
  print "Results:\n$_\n" if ( defined $verbose );
  exit $ERRORS{'OK'};
}
else {
  print "Error: Illegal results!\n";
  print "Results:\n$_\n" if ( defined $verbose );
  exit $ERRORS{'CRITICAL'};
}