Monitoring Hadoop with Munin
This document is based on Debian; some details may differ on other distributions.
Features
Munin is a very flexible and powerful monitoring tool and framework written mainly in Perl for analyzing resources like CPU, memory, hard disks, networks, and services. Munin has two components:
- Munin - the server (aka Grapher/Gatherer), which collects all the data and generates graphs.
- Munin-Node - the client, which tracks the data from the machine and sends it to the server.
Munin runs on every machine that supports Perl. The database of Munin is based on RRD (Round-Robin-Database).
Requirements
- Both
- A POSIX-compatible operating system with Perl installed
- Server
- A configured Webserver/Webapp-Framework, where Munin stores the graphs and web interface.
Installation
Munin can be installed on many Linux distributions through a package manager such as apt, yum, yast, or zypper. This section explains how it works on a Debian 5 (Lenny) machine.
Munin (the server)
- Install the server where the Datameer X distribution is located using the command:
% sudo apt-get install munin
or on Fedora:
% sudo yum install munin
This installs the server and the client. If you don't want the client on the monitoring machine, remove munin-node from the autostart configuration, such as inittab or rc.d (depending on the distribution). Some distributions have no prebuilt packages for Munin; in that case you need to install it manually.
After installation, Munin creates the graphs and the web interface at the location given by htmldir, which can be configured in /etc/munin/munin.conf. The default htmldir is /var/www/munin. If this folder hasn't been created yet, wait a while: the Munin server is not a daemon that runs as a listening process; it is a script that is executed by a cron job. You can run the cron job manually as the root user using the following command:
% su -c /usr/bin/munin-cron munin
If the script prints nothing to stdout, everything should be OK.
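To see where the generated pages should appear, it can help to read htmldir straight from the configuration file. The following is a minimal sketch, not part of Munin itself: the function name munin_htmldir is hypothetical, and it assumes the file uses plain "htmldir <path>" lines as described above.

```shell
# Print the htmldir configured in a munin.conf-style file.
# Falls back to /var/www/munin, the default named above, when the
# file is unreadable or contains no active htmldir line.
munin_htmldir() {
    conf="${1:-/etc/munin/munin.conf}"
    dir=""
    if [ -r "$conf" ]; then
        # Commented lines start with "#", so their first field never
        # equals "htmldir"; the last active line wins.
        dir=$(awk '$1 == "htmldir" { d = $2 } END { print d }' "$conf")
    fi
    echo "${dir:-/var/www/munin}"
}
```

After a manual munin-cron run, listing the directory printed by munin_htmldir should show the generated HTML files and graphs.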
Now you can access Munin through any browser:
http://SERVERADDRESS/munin
You should see "Munin :: Overview" with a list of machines which will be monitored. In this example, you should see the machine you configured as the Munin-Server itself.
Munin-Node (the client)
Install the client where the Datameer X distribution is located, using the command:
% sudo apt-get install munin-node
or on Fedora:
% sudo yum install munin-node
- Next, start the munin-node process. Munin-Node listens on port 4949 by default. If you installed the server and the client on the same machine, you should see graphs in the web interface, which are updated every five minutes.
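One way to confirm that a node is reachable is to connect to its port and look at the greeting line, which has the form "# munin node at <hostname>". The sketch below is an illustration only: both function names are hypothetical, and probe_node assumes that an nc (netcat) binary is installed and that it exits once the node closes the connection.

```shell
# Succeeds if the first line on stdin looks like a munin-node greeting,
# which has the form "# munin node at <hostname>".
is_munin_banner() {
    IFS= read -r first_line || [ -n "$first_line" ] || return 1
    case "$first_line" in
        "# munin node at "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: connect to a node, send "quit", and check the
# greeting. Assumes nc is available; 4949 is the default port.
probe_node() {
    printf 'quit\n' | nc "$1" "${2:-4949}" | is_munin_banner
}
```

For example, `probe_node localhost && echo "node is answering"` would report a running local node, provided the node's access rules permit connections from the machine running the check.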
Resources
Locations of the data and configuration, based on the default configuration.
File/Folder | Description |
---|---|
| Munin-Cronjob (Server) |
| Munin-Node-Cronjob (Client) |
| Control-Script for Munin-Node (Client) |
| Logrotator-Script for Munin (Server) |
| Logrotator-Script for Munin-Node (Client) |
| Configuration-Folder |
| Munin-Configuration (Server) |
| Munin-Node-Configuration (Client) |
| Plugin-Configurations |
| The Plugins |
| Templates for the Web-Interface |
| Autorun: Stop Munin-Node (Shutdown) |
| Autorun: Stop Munin-Node (Localmode) |
| Autorun: Start Munin-Node (Runlevel 2) |
| Autorun: Start Munin-Node (Runlevel 3) |
| Autorun: Start Munin-Node (Runlevel 4) |
| Autorun: Start Munin-Node (Runlevel 5) |
| Autorun: Stop Munin-Node (Restart) |
| The Munin-Cronjob-Script (Server) |
| Shows POD-Documentation for the Plugins of Munin |
| CGI-Script which creates the graphs |
| Munin-Node-Program (Client) |
| Munin-Node-Configurator |
| Munin-Node-Configurator for SNMP |
| Location where the data are stored (RRD) |
| Location where the log files of Munin are stored |
Manage Service
Action | Debian | Fedora |
---|---|---|
Node Start | Execute % /etc/init.d/munin-node start | Execute % /sbin/service munin-node start |
Node Status | Execute % /etc/init.d/munin-node status | Execute % /sbin/service munin-node status |
Node Stop | Execute % /etc/init.d/munin-node stop | Execute % /sbin/service munin-node stop |
Node Restart | Execute % /etc/init.d/munin-node restart | Execute % /sbin/service munin-node restart |
Add Node Autostart | Move | Execute % ntsysv Check 'munin-node' and press 'OK' |
Remove Node Autostart | Move | Execute % ntsysv Uncheck 'munin-node' and press 'OK' |
Configuration
Munin (the server)
By default, you can find the configuration at /etc/munin/munin.conf
Parameter | Default value | Possible values | Description |
---|---|---|---|
| /var/lib/munin | Filesystem-Folder | Location of the RRD-Database |
| /var/www/munin | Filesystem-Folder | Location where the web interface is stored. (Should be accessible through HTTP) |
| /var/log/munin | Filesystem-Folder | Location of the log files |
| /var/run/munin | Filesystem-Folder | Location of Process-State-Files, such as PID |
| /etc/munin/templates | Filesystem-Folder | Location of the templates which are used by the web interface |
You can define the structure of monitored machines for the web interface. (The format looks similar to INI-Configuration-sections.)
For example:
[localhost.localdomain]
    address 127.0.0.1
    use_node_name yes
This uses a group called localdomain, associates it with the machine localhost, and shows localhost as the name instead of the address in the web interface.
To add a machine, copy this section, change localhost to the name of the machine, and change the IP address to that of the machine you want to monitor. For more detailed instructions, see Possible Configuration-Parameters for Munin (Server).
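When several machines need to be added, generating the sections can reduce copy-and-paste mistakes. This is a minimal sketch under the assumption that each machine needs only the three lines from the example; the function name munin_host_stanza and the host name and address in the usage note are hypothetical.

```shell
# Print a host section for munin.conf.
# $1 = name for the section header (for example host.domain)
# $2 = IP address of the node
munin_host_stanza() {
    printf '[%s]\n    address %s\n    use_node_name yes\n' "$1" "$2"
}
```

Calling `munin_host_stanza node1.localdomain 192.168.1.10` prints a section that can be reviewed and then appended to /etc/munin/munin.conf.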
Munin-Node (the client)
By default, you can find the configuration at /etc/munin/munin-node.conf. Additional information about this configuration can be found at Possible Configuration-Parameters for Munin-Node (Client).
Parameter | Default value | Possible values | Description |
---|---|---|---|
| 2 | 0..4 | 0 = Off, 4 = Maximal Verbose |
| /var/log/munin/munin-node.log | Logfile-Location | Where the log file should be stored |
| /var/run/munin/munin-node.pid | Pidfile-Location | Where the PID file should be stored |
| 1 | 1 or comment out and set setsid to 0 | Set to 1 to run in the background, or to 0 to run in the foreground |
| root | User | Run the node as this user |
| root | Group | Run the node as this group |
| yes | yes/1 or no/0 | Whether to fork after binding in order to daemonize |
| ~$ | Expression for excluding Files | Regular expression for files to exclude (this parameter can be repeated) |
| ^127\.0\.0\.1$ | Expression for IP-Address | Regular expression for IP addresses that are allowed to access the node (this parameter can be repeated) |
| * | IP-Address or * for all | The address where the node will listen |
| 4949 | 1..65534 | The port where the node will listen |
| - | CIDR for IP-Address | Allows access for addresses given in CIDR notation. See http://en.wikipedia.org/wiki/CIDR_notation |
| - | CIDR for IP-Address | Cancels (or negates) cidr_allow. See http://en.wikipedia.org/wiki/CIDR_notation |
If the cidr_* parameters do not work in your configuration, use allow instead; which parameters are supported depends on the installed version of the Perl module Net::Server.
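Because allow takes a regular expression rather than a plain address, the dots in an IP address need to be escaped and the expression anchored, as in the default value ^127\.0\.0\.1$. The following sketch shows one way to build such an expression; the function name ip_to_allow_regex is hypothetical and it handles single IPv4 addresses only.

```shell
# Turn an IPv4 address into an anchored regular expression in the
# same form as the default allow value, e.g. 127.0.0.1 -> ^127\.0\.0\.1$
ip_to_allow_regex() {
    # Escape every dot so it matches a literal "." only.
    printf '^%s$\n' "$(printf '%s' "$1" | sed 's/\./\\./g')"
}
```

For example, `echo "allow $(ip_to_allow_regex 192.168.1.5)"` prints a line that could be added to munin-node.conf to admit a Munin server at that (assumed) address.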