Munin Monitoring

From MikroTik Wiki
Revision as of 13:20, 14 January 2009 by Savage

Monitoring Mikrotik with Munin

Introduction

Munin is a very powerful, feature-rich monitoring server based on Tobias Oetiker's RRDTool. The monitoring server runs every 5 minutes via cron and connects to the various configured nodes. Each node runs a daemon that listens for connections from the server and executes a wide range of completely customisable scripts, returning data to the Munin server from which it generates graphs.

As the backend graphing engine is based on RRDTool, RRDTool's graphing options are also available in Munin (the graph_args line in the plugin later in this article passes them through). A really nice thing about Munin and RRDTool is that negative numbers can be graphed.

In this article I will explain how to install Munin as well as Munin-Node on a single server, and how to get Munin to probe your Mikrotik devices via SNMP as well as Telnet (depending on the type of graph). I would strongly advise spending time reading the Munin and RRDTool documentation available at their web sites, so that you have a clear understanding of how Munin operates and generates graphs.

Munin Server Installation and Configuration

Munin-Server is available for Linux and other Unix-like systems. As I use only Ubuntu and FreeBSD, I will base this installation guide on an Ubuntu Linux server. Once the general packages have been installed, the configuration should be much the same for any Munin installation, regardless of the flavour of Linux preferred.

Installing Munin

In our case, we are going to run both munin and munin-node on the same machine. Until such time that (fingers crossed) we can get a munin-node integrated into Mikrotik, the node will be required to run on the same server as Munin itself for best results.

As such, and being Ubuntu, we simply install the two packages, and make sure that we meet all the requirements in terms of dependencies. Additionally, for the Mikrotik scripts below to work, we also need to install the Net::SNMP, and Net::Telnet::Cisco Perl packages.

$ sudo apt-get install munin munin-node libnet-telnet-cisco-perl libnet-snmp-perl

Now that we have all that we need installed (I am presuming you have Apache / tinyHTTP Web server already installed), it's time to head off and do some basic configurations.

Configuring the Master

First things first, we need to edit the Master's configuration file, by default /etc/munin/munin.conf. This is the file where you configure every munin-node that the master needs to poll. As we are using a simple model here, we are only going to be polling localhost, which is accessible via 127.0.0.1 (if it isn't, you have bigger problems than monitoring ;-) ).

Your Munin configuration should thus look something similar to below (Paths may vary on different distributions):

dbdir       /var/lib/munin/
htmldir     /var/www/munin/
logdir      /var/log/munin
rundir      /var/run/munin/

[localhost]
        address 127.0.0.1

dbdir is the directory where munin stores its internal state files, as well as the RRD database files
htmldir is the directory where munin will create the appropriate html files as well as png images
logdir keeps various log files of what munin is doing - useful when things don't go as you intended
rundir holds munin's pid files, lock files, etc. Nothing fancy here really

Additionally, we have configured one active node which munin needs to poll. Munin will connect to 127.0.0.1 on 4949/TCP (the default port that munin-node listens on) and poll this node for any nodes and/or scripts configured to be graphed.

Configuring the Node

Whilst Munin-Node is pretty secure out of the box, there are some basic things we need to change. Even though the node only authorizes localhost to gather data from it, it listens by default on all IP addresses. As a security measure, we are going to alter the munin-node configuration file and ensure that we are only listening on localhost for connections from the Munin Server. As such, we need to open up /etc/munin/munin-node.conf in your favourite editor.

We need to alter the host value in order to bind munin-node to the 127.0.0.1 address. Your config should now look like this:

#host *
host 127.0.0.1
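
The binding can be double-checked without opening the file again. The helper below is only a minimal sketch: the function name node_bind_address is made up for this illustration, and it assumes the configuration uses the plain "host <address>" form shown above.

```shell
# Print the address from every active (uncommented) "host" line in a
# munin-node configuration file. The function name is illustrative only.
node_bind_address() {
  # A commented line such as "#host *" has "#host" as its first field,
  # so only the active "host" directive matches here.
  awk '$1 == "host" { print $2 }' "$1"
}

# Example call (path as used in this article):
#   node_bind_address /etc/munin/munin-node.conf
```

If this prints anything other than 127.0.0.1, munin-node is probably still listening on more addresses than intended.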

This is about all that you need to do to get Munin working. As we have modified the configuration, we need to restart the munin-node service. In Ubuntu I issue:

$ /etc/init.d/munin-node restart

Configuring Apache

Apache needs access to the directory configured as Munin's htmldir in order for you to see the pretty graphs and generated html in the browser of your choice. As such we need to configure some Alias and Directory settings in Apache's configuration. This can be done either inside a Virtual Host of your choice, or in Apache's main configuration. As an example, I have elected to configure a new Apache Virtual Host which will only serve up the munin pages. The Virtual Host's configuration will be something similar to:

<VirtualHost *:80>
        ServerAdmin webmaster@localhost
        ServerName monitor.example.com
        DocumentRoot /var/www/munin
        <Directory />
                Options FollowSymLinks
                AllowOverride None
        </Directory>
        CustomLog /var/log/apache2/monitor.example.com.access.log combined
        ErrorLog /var/log/apache2/monitor.example.com.error.log
        ServerSignature On
</VirtualHost>

Verify that the syntax of your Apache configuration file is correct (apache2ctl -t), and then restart your Apache web server to enable the newly configured Virtual Host:

$ sudo apache2ctl -t
Syntax OK
$ sudo apache2ctl graceful
$

Open up your favourite web browser, browse to http://monitor.example.com, and you should have pretty graphs for the server on which you are running munin.


Munin Node for Mikrotik

Configuring Munin to monitor a Mikrotik device is really a two-step process. In a conventional configuration, where Munin-Node is available on the target being monitored, we would only need to configure Munin to poll the node in question. However, because there is no Munin-Node for Mikrotik, we need to alias the Mikrotik node to our configured Munin-Node server, which then executes the scripts against our Mikrotik device. The downside of this is that all scripts will need to be executed twice, and as such scripts which use Telnet to obtain data will log in two times to each router that is being monitored.

For appropriate examples, I am going to monitor two Mikrotik routers and show the various scripts available so far. If there are any updates to these scripts, or requirements for additional scripts, please feel free to let me know either via the Forum, or via email.


Configuring additional Nodes in Munin

The first step in monitoring Mikrotik devices is to configure the nodes as an alias to our locally running Munin-Node. This is done by editing the Munin master configuration file, by default located at /etc/munin/munin.conf. We edit this file using our favourite editor, and define the two new nodes as listed below.

The part in the brackets defines the node's name, and for sanity purposes I recommend that the FQDN always be used to make your life a lot easier in the following sections. The address will always point to 127.0.0.1, as Munin needs to connect to the only real munin-node we have, which is running on localhost.

[node1.somewhere.com]
    address 127.0.0.1

[node2.somewhere.com]
    address 127.0.0.1
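
Because the node names defined here have to be repeated exactly in later steps, it can help to print them straight from the file. This is a rough sketch rather than part of Munin itself: list_munin_nodes is a hypothetical helper name, and it assumes each node header sits on its own line in the bracketed form shown above.

```shell
# List the node names (the bracketed section headers) found in a Munin
# master configuration file. The function name is illustrative only.
list_munin_nodes() {
  # Keep only lines of the form "[name]" and print the text between
  # the brackets.
  sed -n 's/^[[:space:]]*\[\([^]]*\)][[:space:]]*$/\1/p' "$1"
}

# Example call (path as used in this article):
#   list_munin_nodes /etc/munin/munin.conf
```

With the configuration from this article, the output should contain localhost, node1.somewhere.com and node2.somewhere.com, one per line.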


Monitoring CPU Usage

Now that we have our Munin nodes configured, we need to configure the scripts inside Munin-Node to poll our Mikrotik devices. This is where things get complicated, and as an example I will configure a script to monitor and graph the CPU usage of both our routers. The scripts are by default located in /etc/munin/plugins/. The naming of these scripts is very important, and attention needs to be given to the location as well as the names.

Our CPU Monitoring script utilises SNMP, therefore, make sure that SNMP is enabled on your routers, and that the Server running munin has access to query your router via SNMP. A simple test can be performed to ensure that this is the case:

$ snmpget -v 1 -c public node1.somewhere.com .1.3.6.1.2.1.25.3.3.1.2.1
HOST-RESOURCES-MIB::hrProcessorLoad.1 = INTEGER: 5
$

Congratulations, we have just obtained the CPU usage of our router via a simple SNMP query. Should this query not be successful, it means that Munin will be unable to query your router via SNMP, and you need to correct this before proceeding.
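
With more than one router, the same test can be repeated in a loop. The sketch below reuses the snmpget options, community string and OID from the example above. The function name check_snmp_cpu is invented for this illustration, and it assumes snmpget prints its answer in the "... = INTEGER: n" form shown earlier.

```shell
# Query the CPU load OID on each router given as an argument and print
# one summary line per router. The function name is illustrative only.
check_snmp_cpu() {
  for node in "$@"; do
    if reply=$(snmpget -v 1 -c public "$node" .1.3.6.1.2.1.25.3.3.1.2.1 2>/dev/null); then
      # Strip everything up to and including "INTEGER: " to keep the value.
      printf '%s: %s%%\n' "$node" "${reply##*INTEGER: }"
    else
      printf '%s: SNMP query failed\n' "$node"
    fi
  done
}

# Example call:
#   check_snmp_cpu node1.somewhere.com node2.somewhere.com
```

Any router reported as failed needs its SNMP settings or firewall rules looked at before the plugin is installed for it.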

Now that we know SNMP queries are working like they should, we need to get a simple script operational to pull this data into Munin for monitoring... Copy the script below to query the nodes via SNMP for CPU usage, and save it in the /etc/munin/plugins directory having the specific file name of mikrotikcpu_node1.somewhere.com (where node1.somewhere.com is the same as the node name you configured earlier in /etc/munin/munin.conf). These names MUST be the same, otherwise, the scripts WILL fail.

Mikrotik CPU Usage via SNMP:

#!/usr/bin/perl
###############################################################################
use diagnostics;
use Net::SNMP;
use strict;
use warnings;
###############################################################################
my $CPUOID = ".1.3.6.1.2.1.25.3.3.1.2.1";
my $SNMPCommunity = "public";
my $SNMPPort = "161";

###############################################################################
## Determine Hostname
my $Host = undef;
$0 =~ /mikrotikcpu_(.+)$/;
unless ($Host = $1) {
  exit 2;
}

###############################################################################
## Initiate SNMP Session
my ($Session, $Error) = Net::SNMP->session (-hostname  => $Host,
                                            -community => $SNMPCommunity,
                                            -port      => $SNMPPort,
                                            -timeout   => 60,
                                            -retries   => 5,
                                            -version   => 1);
if (!defined($Session)) {
  die "Croaking: $Error";
}

###############################################################################
## Configuration
if ($ARGV[0] && $ARGV[0] eq "config") {
  print "host_name " . $Host . "\n";
  print "graph_args -l 0 -r --vertical-label percent --lower-limit 0 --upper-limit 100\n";
  print "graph_title CPU usage\n";
  print "graph_category system\n";
  print "graph_info This graph shows the router's CPU usage.\n";
  print "graph_order Total\n";
  print "graph_vlabel %\n";
  print "graph_scale no\n";
  print "Total.label CPU Usage\n";
  print "Total.draw AREA\n";
  print "Total.warning 60\n";
  print "Total.critical 90\n";
  $Session->close;
  exit;
}

###############################################################################
## Execution
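my $Result = $Session->get_request(-varbindlist => [$CPUOID]);
unless (defined($Result)) {
  # Report the SNMP failure instead of exiting silently without a value
  my $Message = $Session->error;
  $Session->close;
  die "SNMP query failed: $Message\n";
}
print "Total.value " . $Result->{$CPUOID} . "\n";
$Session->close;
exit;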
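
Since the script takes the router's name from its own file name, every router needs its own copy in the plugins directory. One possible way to create those copies is sketched below. The function name install_plugin_copies and the example source path are assumptions made for this illustration, not something Munin provides.

```shell
# Copy one saved plugin script into the plugin directory once per router,
# naming each copy mikrotikcpu_<node>. The function name is illustrative.
install_plugin_copies() {
  source_file=$1
  plugin_dir=$2
  shift 2
  for node in "$@"; do
    cp "$source_file" "$plugin_dir/mikrotikcpu_$node" || return 1
  done
}

# Example call (the source path is a hypothetical location):
#   install_plugin_copies /root/mikrotikcpu /etc/munin/plugins \
#     node1.somewhere.com node2.somewhere.com
```

The node names passed in have to be the same names used in the brackets in /etc/munin/munin.conf.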

Next, we need to make sure that Munin can execute the file, and that the script is operating successfully:

$ chmod u+x /etc/munin/plugins/mikrotikcpu_node1.somewhere.com
$ chown munin:munin /etc/munin/plugins/mikrotikcpu_node1.somewhere.com
$ /etc/munin/plugins/mikrotikcpu_node1.somewhere.com config
host_name node1.somewhere.com
graph_args -l 0 -r --vertical-label percent --lower-limit 0 --upper-limit 100
graph_title CPU usage
graph_category system
graph_info This graph shows the router's CPU usage.
graph_order Total
graph_vlabel %
graph_scale no
Total.label CPU Usage
Total.draw AREA
Total.warning 60
Total.critical 90

Our script seems to be working fine. Very important: when the script is executed with the config parameter, make 100% sure that the value of the host_name configuration variable is returned correctly, and that it is identical to the name of the node configured in /etc/munin/munin.conf.
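
The comparison can also be automated. The following is a minimal sketch under the naming scheme used in this article. Both function names, plugin_host and host_name_matches, are invented for this example.

```shell
# Extract the host part from a plugin path such as
# /etc/munin/plugins/mikrotikcpu_node1.somewhere.com.
plugin_host() {
  name=$(basename "$1")
  case "$name" in
    mikrotikcpu_?*) printf '%s\n' "${name#mikrotikcpu_}" ;;
    *) return 1 ;;
  esac
}

# Read the plugin's "config" output on standard input and succeed only
# if the reported host_name equals the expected host given as $1.
host_name_matches() {
  expected=$1
  reported=$(awk '$1 == "host_name" { print $2; exit }')
  [ "$reported" = "$expected" ]
}

# Example call:
#   p=/etc/munin/plugins/mikrotikcpu_node1.somewhere.com
#   "$p" config | host_name_matches "$(plugin_host "$p")" && echo "names match"
```

This only compares the plugin's file name with its own output. The name still has to be checked against the bracketed node name in /etc/munin/munin.conf.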

The last step is to restart our munin-node, as we have made changes to it (we added more scripts). We simply execute the restart command:

$ /etc/init.d/munin-node restart

Now that we have restarted munin-node, we can also test the configuration and see what munin will see when querying the nodes that we have configured. In order to do this, we are going to telnet into Munin-Node and obtain the graph data ourselves. This is done by telnetting to the munin-node engine, running at 127.0.0.1 port 4949 (if you didn't change any configuration other than that mentioned in this document). We then send the 'nodes' command, which will give us a list of nodes running under the munin-node that we have configured.

$ telnet localhost 4949
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
# munin node at localhost.somewhere.com
nodes
localhost
node1.somewhere.com
node2.somewhere.com
.

It seems to be perfect. We have localhost, node1.somewhere.com, as well as node2.somewhere.com. Let's see what plugins are available under each node. Again, we telnet into the munin-node daemon, but this time we will execute two commands, 'list node1.somewhere.com' and 'list node2.somewhere.com'. This will list all the plugins currently configured for these two nodes.

$ telnet 127.0.0.1 4949
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
# munin node at localhost.somewhere.com
list node1.somewhere.com
mikrotikcpu_node1.somewhere.com
list node2.somewhere.com
mikrotikcpu_node2.somewhere.com

We can see that in both instances the mikrotikcpu_ script has been configured and is activated for both our Mikrotik nodes. We can also attempt to fetch the data by sending a 'fetch mikrotikcpu_node1.somewhere.com' command to munin-node, at which time munin-node will query the Mikrotik router and return the data.
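
The same fetch can be scripted instead of typed into telnet. The sketch below assumes a netcat binary called nc is installed, which this article does not otherwise require. The function names munin_node_query and fetch_value are invented for this illustration.

```shell
# Send a single command to the local munin-node, followed by "quit" so
# that the daemon closes the connection, and print the reply.
# Assumes a netcat binary named "nc" is available.
munin_node_query() {
  printf '%s\nquit\n' "$1" | nc 127.0.0.1 4949
}

# Read a fetch reply on standard input and print the value of one field,
# for example "Total" for a "Total.value 5" line.
fetch_value() {
  awk -v field="$1.value" '$1 == field { print $2 }'
}

# Example call:
#   munin_node_query 'fetch mikrotikcpu_node1.somewhere.com' | fetch_value Total
```

An empty result would suggest that the plugin returned no value, in which case running the plugin by hand as shown earlier is the quickest way to see why.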

If all went well, you should by now have two additional nodes listed on Munin's web interface, and both of these nodes should have graphs showing the CPU usage as a percentage between 0 and 100.