This script makes it pretty easy to monitor Redis replication lag in Nagios.

The idea is just to call redis-cli, run the info command and grab the value of the master_last_io_seconds_ago field from its output.

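Once we have that, we just compare it against the warning and critical limits and exit with the right status code (according to the Nagios manual): 0 for OK, 1 for warning, 2 for critical and 3 for unknown.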
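To make the mapping between lag and exit code concrete, here is a minimal sketch of that check pulled out into a function. The name nagios_status is made up for this example and is not part of the script below; it only assumes the usual Nagios plugin exit codes and that the warning limit is lower than the critical one.

```bash
#!/bin/bash
# Nagios plugin exit codes: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.

# nagios_status LAG WARNING_LIMIT CRITICAL_LIMIT
# Hypothetical helper: prints the status name and returns the matching exit code.
nagios_status() {
    local lag=$1 warn=$2 crit=$3

    if [ -z "$lag" ]; then
        # No value at all, e.g. redis-cli failed or the field was missing.
        echo "UNKNOWN"
        return 3
    elif [ "$lag" -ge "$crit" ]; then
        echo "CRITICAL"
        return 2
    elif [ "$lag" -ge "$warn" ]; then
        echo "WARNING"
        return 1
    fi

    echo "OK"
    return 0
}

# A lag of 5 seconds with limits of 10 (warning) and 30 (critical).
nagios_status 5 10 30
```

The critical limit is checked first, so a lag above both limits is reported as critical rather than as a warning.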

There are other scripts in the wild, but the idea behind this one was to avoid any dependency…since, you know…you already have Bash ;-)

#!/bin/bash

# Positional arguments: the Redis port, then the warning and critical limits (in seconds).
PORT=$1

WARNING_LIMIT_LAG=$2
CRITICAL_LIMIT_LAG=$3


# Ask Redis for its INFO output and keep the integer value of master_last_io_seconds_ago.
REPLICATION_LAG=$(/usr/local/bin/redis-cli -p "$PORT" info | grep master_last_io_seconds_ago | awk -F: '{printf "%d", $2}')


if [ -z "$REPLICATION_LAG" ]; then
	echo -n "Unknown: Could not get replication lag information."
	exit 3
fi

if [ "$REPLICATION_LAG" -ge "$CRITICAL_LIMIT_LAG" ]; then
	echo -n "Critical: Current replication lag is $REPLICATION_LAG seconds. Critical limit: $CRITICAL_LIMIT_LAG seconds."
	exit 2
fi

if [ "$REPLICATION_LAG" -ge "$WARNING_LIMIT_LAG" ]; then
	echo -n "Warning: Current replication lag is $REPLICATION_LAG seconds. Warning limit: $WARNING_LIMIT_LAG seconds."
	exit 1
fi

echo -n "Ok: Current replication lag is $REPLICATION_LAG seconds."
exit 0

The script is pretty naive and assumes that everything will just work, so there are several things to improve, for example:

  • Making script arguments non-positional.
  • Supporting other hosts.
  • etc.
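
As a rough idea of how the first two points could look, here is a sketch using the getopts builtin. The option letters (-H, -p, -w, -c), the defaults and the parse_args function are choices made for this example only; the -h and -p flags passed to redis-cli at the end are its standard host and port options.

```bash
#!/bin/bash
# Sketch only: the option names and defaults below are assumptions, not part of the original script.

# parse_args [-H host] [-p port] -w warning_limit -c critical_limit
# Fills in HOST, PORT, WARNING_LIMIT_LAG and CRITICAL_LIMIT_LAG.
# Returns 3 (Nagios UNKNOWN) on an unknown option or a missing limit.
parse_args() {
    HOST=127.0.0.1
    PORT=6379
    WARNING_LIMIT_LAG=
    CRITICAL_LIMIT_LAG=

    local opt OPTIND=1
    while getopts "H:p:w:c:" opt; do
        case "$opt" in
            H) HOST=$OPTARG ;;
            p) PORT=$OPTARG ;;
            w) WARNING_LIMIT_LAG=$OPTARG ;;
            c) CRITICAL_LIMIT_LAG=$OPTARG ;;
            *) return 3 ;;
        esac
    done

    # Both limits are required; host and port fall back to the defaults above.
    if [ -z "$WARNING_LIMIT_LAG" ] || [ -z "$CRITICAL_LIMIT_LAG" ]; then
        return 3
    fi
    return 0
}

# Example invocation with a made-up host address.
parse_args -H 10.0.0.5 -p 6380 -w 10 -c 30
echo "would run: redis-cli -h $HOST -p $PORT info (warning=$WARNING_LIMIT_LAG, critical=$CRITICAL_LIMIT_LAG)"
```

In the real script you would call parse_args "$@" at the top, print a usage message and exit 3 if it fails, and then add -h "$HOST" to the redis-cli call.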

Enjoy