Automatic Downtimes in Nagios Without Using Time Periods
Monitoring systems are a great thing, and we rely heavily on Nagios here at PalominoDB. We also rely heavily on xtrabackup for (mostly) non-locking, "warm" backups of MySQL. In order to get a consistent backup with a proper binlog position on a slave, xtrabackup stops replication for a short period of time. If the monitoring system catches this, the pager goes off, usually in the middle of the night.
Nagios can define specific windows when checks are on or off for a particular host or service, but then you have to remember to update that window whenever you change the time the backup runs. I prefer to call a script from cron, just before I call the backup script, so that I can easily see that the two are related.
Note that this script works even if you have Nagios password-protected with .htaccess, because wget accepts a username and password. This script is not perfect: there is not a lot of error checking, and the log file paths are hard-coded. But it does what it needs to.
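As a rough illustration of the kind of error checking the script skips, here is a minimal sketch. The helper names check_args and run_checked are hypothetical and not part of the original script, and unlike the original, this sketch treats the length as required instead of defaulting it to 3600.

```perl
use strict;
use warnings;

# Hypothetical helper: return an error message for unusable arguments,
# or nothing (undef in scalar context) when all three look fine.
sub check_args {
    my ($service, $length, $host) = @_;
    return "service name is required"
        unless defined $service && length $service;
    return "length must be a positive whole number of seconds"
        unless defined $length && $length =~ /^[1-9][0-9]*$/;
    return "host name is required"
        unless defined $host && length $host;
    return;
}

# Hypothetical helper: run a command given as a list, so no shell is
# involved, and turn the wait status into an error message on failure.
sub run_checked {
    my (@command) = @_;
    my $status = system(@command);
    return "could not start $command[0]: $!" if $status == -1;
    return "$command[0] died from signal " . ($status & 127)
        if $status & 127;
    return "$command[0] exited with status " . ($status >> 8)
        if $status >> 8;
    return;
}
```

With something like this in place, the script could die with a usage message when check_args returns an error, and write the message from run_checked to the log when wget fails, so that a failed downtime request is noticed before the pager goes off.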
The script is called like this:
./scheduleDowntimeByService.pl service length host
Sample cron entries (the service names depend on what you are checking and on how the services are named in your Nagios configuration; because the name goes straight into the URL, spaces are written as +):
00 5 * * 0 /home/palominodb/bin/scheduleDowntimeByService.pl MySQL+Replication+Lag 3600 hostname
00 5 * * 0 /home/palominodb/bin/scheduleDowntimeByService.pl MySQL+Slave+IO+Thread 3600 hostname
00 5 * * 0 /home/palominodb/bin/scheduleDowntimeByService.pl MySQL+Slave+SQL+Thread 3600 hostname
Here is the script itself:
cat scheduleDowntimeByService.pl
#!/usr/bin/perl
use strict;
use POSIX;
my $service = shift @ARGV;
my $length = shift @ARGV;
my $host = shift @ARGV;
unless ($length) {
    $length = 3600;
}
my $startTime = time();
my $endTime = $startTime + $length;
my $nagios_start = POSIX::strftime("%m-%d-%Y %H:%M:00", localtime($startTime));
my $nagios_end = POSIX::strftime("%m-%d-%Y %H:%M:00", localtime($endTime));
$nagios_start =~ s@:@%3A@g;
$nagios_start =~ s@-@%2D@g;
$nagios_start =~ s@ @%20@g;
$nagios_end =~ s@:@%3A@g;
$nagios_end =~ s@-@%2D@g;
$nagios_end =~ s@ @%20@g;
my $URL = 'https://monitoring.company.com/nagios//cgi-bin/cmd.cgi?fixed=1&start_time=' . $nagios_start . '&end_time=' . $nagios_end . '&cmd_typ=56&cmd_mod=2&host=' . $host . '&service=' . $service . '&com_data=Backups&btnSubmit=Commit';
my $cmd = "/usr/bin/wget --quiet --user=user --password=PASS -O /tmp/nagios_downtime.html '$URL'";
# Append the command, and anything wget prints, to the log file.
open ( L, ">>/tmp/nagios_downtime.log" ) or die "Cannot open log file: $!";
print L $cmd . "\n";
print L `$cmd`;
close L;
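The hand-written substitutions in the script only cover the colon, hyphen and space in the timestamps, and the host and service names go into the URL untouched. Here is a hedged sketch of building the same query string with one general percent-encoding routine. The function names url_encode and downtime_url are hypothetical, and the sketch assumes plain ASCII values and a service name given with real spaces rather than +.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Percent-encode every character that is not an RFC 3986 unreserved
# character (letters, digits, '-', '_', '.', '~'). Assumes ASCII input.
sub url_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.~])/sprintf("%%%02X", ord($1))/ge;
    return $text;
}

# Build the cmd.cgi URL for a fixed service downtime, using the same
# parameters and timestamp format as the script above.
sub downtime_url {
    my ($base, $host, $service, $start, $length) = @_;
    my $format = "%m-%d-%Y %H:%M:00";
    my @params = (
        fixed      => 1,
        start_time => strftime($format, localtime($start)),
        end_time   => strftime($format, localtime($start + $length)),
        cmd_typ    => 56,
        cmd_mod    => 2,
        host       => $host,
        service    => $service,
        com_data   => 'Backups',
        btnSubmit  => 'Commit',
    );
    my @pairs;
    # Walk the list two items at a time to keep the parameter order.
    while (my ($key, $value) = splice(@params, 0, 2)) {
        push @pairs, $key . '=' . url_encode($value);
    }
    return $base . '?' . join('&', @pairs);
}
```

Called as downtime_url('https://monitoring.company.com/nagios//cgi-bin/cmd.cgi', $host, $service, time(), $length), this would produce a URL of the same shape as $URL in the script. The cron entries would then pass the service name quoted, for example "MySQL Replication Lag", since a literal + would be encoded as %2B.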