So you’ve got Ubuntu 11.04 and think you’ll use Nagios and its fun plugin, NRPE, to monitor your cloud servers, right? Sure! There are some gotchas though. Read on. Note: You may want to check out check_mk instead of NRPE as it is reportedly sexier.
For readability I’ll start with the remote server. The order doesn’t matter much, except that the final check_nrpe command on the monitoring server won’t work until the remote server is set up.
On the remote server:
apt-get install nagios-nrpe-server
vim /etc/nagios/nrpe.cfg
Edit the allowed_hosts line as follows, replacing 1.2.3.4 with your monitoring server’s IP. Be mindful of the lack of spaces:
allowed_hosts=127.0.0.1,1.2.3.4
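Since a stray space on that line is easy to miss, here is a minimal sketch of a sanity check. The function name check_allowed_hosts is my own invention, not part of NRPE, and it only looks at the first allowed_hosts= line in the file you give it.

```shell
#!/bin/sh
# Hypothetical helper (not part of NRPE): succeeds only if the file contains
# a line starting with "allowed_hosts=" and that line has no spaces in it.
check_allowed_hosts() {
    cfg="$1"
    # Grab the first matching line only.
    line=$(sed -n '/^allowed_hosts=/{p;q;}' "$cfg")
    # No such line at all (or written as "allowed_hosts = ..."): treat as a failure.
    [ -n "$line" ] || return 1
    case "$line" in
        *" "*) return 1 ;;
    esac
    return 0
}

# Example:
# check_allowed_hosts /etc/nagios/nrpe.cfg && echo "allowed_hosts looks fine"
```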
Edit the command lines at the bottom:
command[check_hda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/hda1
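The service definition further down calls check_sda1, so if your disk is sda1, change both instances of hda1 above (the command name and the device path) to sda1. Whatever command name you use here must match the check_command on the monitoring server exactly. Run the df command to see your partitions.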
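If you would rather not eyeball the df output, something like the following should print the device behind a mount point. This is a rough sketch; root_device is a name I made up. On cloud servers the answer may be neither hda1 nor sda1 (Xen guests often show xvda1 and virtio guests vda1), so it is worth checking.

```shell
#!/bin/sh
# Hypothetical helper: print the device that backs a mount point (default: /).
# df -P forces the POSIX output format, so the device is always the first
# field of the second line.
root_device() {
    df -P "${1:-/}" | awk 'NR==2 {print $1}'
}

# Example:
# root_device /
```

Use whatever it prints as the -p argument in the check_disk command line.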
service nagios-nrpe-server restart
On the monitoring server:
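apt-get install nagios3 nagios-nrpe-plugin
The nagios-nrpe-plugin package provides the check_nrpe plugin used below. Without it, Nagios reports "Return code of 127 is out of bounds – plugin may be missing". The installer will ask you to choose a password and a mail method. I chose Internet server for the mail method and a random password.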
vim /var/www/index.html
# You may want to clear out this file or change it to a more suitable index. Default splash pages annoy me.
cd /etc/nagios3/conf.d/
cp localhost_nagios2.cfg REMOTESERVER_nagios2.cfg
Replace REMOTESERVER with the remote server’s hostname.
vim REMOTESERVER_nagios2.cfg
REMOTESERVER_nagios2.cfg
define host{
use generic-host
host_name REMOTESERVER.EXAMPLE.COM ; Change to the remote server's hostname and domain name.
alias REMOTESERVER ; Change to the remote server's hostname
address 5.6.7.8 ; Change to the IP address the monitoring server uses to reach the remote server (usually its public/WAN address)
}
define service{
use generic-service
host_name REMOTESERVER.EXAMPLE.COM ; Change to the remote server's hostname and domain name.
service_description Disk Space 1
check_command check_nrpe!check_sda1
}
define service{
use generic-service
host_name REMOTESERVER.EXAMPLE.COM ; Change to the remote server's hostname and domain name.
service_description Current Users
check_command check_nrpe!check_users
}
define service{
use generic-service
host_name REMOTESERVER.EXAMPLE.COM ; Change to the remote server's hostname and domain name.
service_description Total Processes
check_command check_nrpe!check_total_procs
}
define service{
use generic-service
host_name REMOTESERVER.EXAMPLE.COM ; Change to the remote server's hostname and domain name.
service_description Current Load
check_command check_nrpe!check_load
}
#### End REMOTESERVER_nagios2.cfg
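The four service blocks above differ only in their description and NRPE command, so if you have several remote servers you could generate them instead of copying and pasting. A minimal sketch, assuming a helper of my own naming (nrpe_service); it only prints text and does not touch any files.

```shell
#!/bin/sh
# Hypothetical helper: print one service definition for an NRPE check.
# $1 = host_name, $2 = service_description, $3 = NRPE command name
nrpe_service() {
    cat <<EOF
define service{
        use                     generic-service
        host_name               $1
        service_description     $2
        check_command           check_nrpe!$3
}
EOF
}

# Example (prints to stdout; append to your .cfg file once you are happy with it):
# nrpe_service REMOTESERVER.EXAMPLE.COM "Current Load" check_load
```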
vim hostgroups_nagios2.cfg
hostgroups_nagios2.cfg
You likely have ssh running on your remote server, so go ahead and add it to the ssh-servers hostgroup. You can add it to others too.
define hostgroup {
hostgroup_name ssh-servers
alias SSH servers
members localhost, REMOTESERVER.EXAMPLE.COM ; Change to the remote server's hostname and domain name, being careful to keep the space after the comma
}
#### End hostgroups_nagios2.cfg
service nagios3 restart
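A typo in any of the files above can keep Nagios from starting, so it can be worth running its built-in config check (the -v flag) before restarting. Here is a sketch that only restarts when the check passes. The safe_restart name and the two variables are my own, and the default paths assume Ubuntu's nagios3 package.

```shell
#!/bin/sh
# Assumed defaults for Ubuntu's nagios3 package; override them if yours differ.
NAGIOS_BIN="${NAGIOS_BIN:-nagios3}"
NAGIOS_CFG="${NAGIOS_CFG:-/etc/nagios3/nagios.cfg}"

# Hypothetical helper: verify the configuration, restart only if it is valid.
safe_restart() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null; then
        service nagios3 restart
    else
        echo "Config check failed; run '$NAGIOS_BIN -v $NAGIOS_CFG' to see why" >&2
        return 1
    fi
}

# Example:
# safe_restart
```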
/usr/lib/nagios/plugins/check_nrpe -H REMOTESERVER.EXAMPLE.COM
Ideally it’ll respond with NRPE v2.12 or similar.
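You can also run a single remote command by hand with check_nrpe's -c option and look at the exit code, which is what Nagios itself does. Plugins are expected to exit with 0 to 3, and anything else is "out of bounds". The little translator below is a sketch of my own (nagios_state is not a Nagios tool). The 127 case is the shell's usual "command not found" code, which tends to mean the plugin is not installed where Nagios expects it.

```shell
#!/bin/sh
# Hypothetical helper: translate a plugin exit code into the Nagios state name.
nagios_state() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        127) echo "command not found (is the plugin installed?)" ;;
        *) echo "out of bounds ($1)" ;;
    esac
}

# Example, run on the monitoring server:
# /usr/lib/nagios/plugins/check_nrpe -H REMOTESERVER.EXAMPLE.COM -c check_users
# nagios_state $?
```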
And you should be done!
Open http://MONITORINGSERVER.EXAMPLE.COM/nagios3 and log in as nagiosadmin with the password you created when you installed Nagios.
There are a ton of customizations you can do, such as setting parents, customizing icons, installing pnp4nagios, and changing the notifications.
Hi, I always get this error in my web frontend: (Return code of 127 is out of bounds – plugin may be missing)
The remote server is a Debian 7 server; the Nagios server is an Ubuntu 13 server.
You define check_hda1 on the remote server but call check_sda1 from the monitoring server 🙂
Will U Are AWESOME! Thank you buddy! God Bless You! Works like a charm!