The Men & Mice Blog

Supervising BIND 9

Posted by Men & Mice on 1/21/15 2:59 PM

BIND 9 is a mature piece of software, and compared with its predecessors BIND 4 and BIND 8 it is noticeably more stable and secure. One reason for this is the "Design by contract" programming style used by the BIND 9 team; as a result, BIND 9 is very particular about the data it consumes, and about its own internal data structures. Once BIND 9 encounters an unexpected state in its data structures, it terminates the DNS server process rather than continue running with bad data (and thus potentially compromise security).

While this behavior has clear advantages in terms of security, it can adversely affect service uptime - BIND 9 had several incidents in the past years where BIND 9 terminated because of issues inside the code or data structures, such as "BIND 9 Resolver crashes after logging an error in query.c". And for all its security benefits, an end user unable to reach Facebook may not be terribly understanding in the event of an outage.

The real issue, however, is not that BIND terminates when it comes across bad data, but rather that the process cannot automatically restart after the fact; there is no "supervisor" process in BIND 9.

Some operating systems have a built-in solution: MacOS X has launchd, and the BIND 9 version Apple delivers with the OS is automatically restarted should it terminate unexpectedly. Solaris has SMF (Service Management Facility), and BIND 9 can be integrated into SMF. Recent versions of Ubuntu, RedHat Enterprise, SuSe Enterprise, and Fedora now all use systemd, which can also monitor processes and restart them if needed.

But for Unix and Linux operating systems that do not ship with a process supervisor solution, supervisord is a strong alternative, with the added benefit of being relatively easy to install and configure. Supervisord comes as a package with many Linux distributions, and also works on BSD distributions.

The configuration below is intended for RedHat 6, but should require only minor tweaks to run on other Unix systems as well.

Installation

Supervisord is written in Python (2.4 - 2.7) and can be installed from source (where we have to download and install all dependencies) or with the help of setuptools, which takes care of downloading and installing dependencies (Meld3 and ElementTree).

Full Installation instructions can be found at [http://supervisord.org/installing.html]

Automatic Installation

‣ download "setuptools" from [https://pypi.python.org/packages/source/s/setuptools/setuptools-9.1.tar.gz]

 shell> tar xfz setuptools-9.1.tar.gz 
shell> cd setuptools-9.1 
root-shell> python setup.py install

Once setuptools have been installed, run the following command to install Supervisor and all required dependencies:

 root-shell> easy_install supervisor 

Manual Installation

Supervisor and its dependencies can also be installed manually.

‣ download "setuptools" from [https://pypi.python.org/packages/source/s/setuptools/setuptools-9.1.tar.gz]

 shell> tar xfz setuptools-9.1.tar.gz 
shell> cd setuptools-9.1 
root-shell> python setup.py install

‣ download "Meld3" from [http://www.plope.com/software/meld3/meld3-0.6.5.tar.gz]

 shell> tar xfz meld3-0.6.5.tar.gz 
shell> cd meld3-0.6.5 
root-shell> python setup.py install

‣ download "ElementTree" from [http://effbot.org/media/downloads/elementtree-1.2.6-20050316.tar.gz]

 shell> tar xfz elementtree-1.2.6-20050316.tar.gz 
shell> cd cd elementtree-1.2.6-20050316 
root-shell> python setup.py install

‣ download "Supervisor" from [https://pypi.python.org/packages/source/s/supervisor/supervisor-3.1.3.tar.gz]

 shell> tar xfz supervisor-3.1.3.tar.gz 
shell> cd supervisor-3.1.3 
root-shell> python setup.py install

Installing startscript and sysconfig

‣ download the startscript from [https://raw.githubusercontent.com/Supervisor/initscripts/master/redhat-init-jkoppe] and place it in /etc/init.d/supervisord

 root-shell> cp redhat-init-jkoppe /etc/init.d/supervisord 
root-shell> chmod +x /etc/init.d/supervisord

‣ download the 'sysconfig' file from [https://raw.githubusercontent.com/Supervisor/initscripts/master/redhat-sysconfig-jkoppe] and place it in /etc/sysconfig/supervisord

 root-shell> cp redhat-sysconfig-jkoppe /etc/sysconfig/supervisord 

Installing Bind 9 from Men & Mice repositories

‣ download the BIND 9 RPM from [http://support.menandmice.com/download/bind/linux/redhat/6.x/]<arch>/<version>/

 root-shell> yum install ISCBIND-<version>-<flavor>RHL<arch>.rpm 
root-shell> mkdir /var/named 
root-shell> useradd -d /var/named -r named 
root-shell> chown -R named: /var/named

‣ create a BIND 9 configuration file '/etc/named.conf'

options { directory "/var/named"; dnssec-validation auto; }; 

‣ create an 'rndc' configuration

root-shell> rndc-confgen -a 

‣ verify the configuration

root-shell> named-checkconf -z 

A basic configuration file for BIND 9 "named"

Below is my basic /etc/supervisord.conf configuration file for one service, the BIND 9 DNS Server:

 [unix_http_server] 
file = /tmp/supervisor.sock 
chmod = 0777 
chown= nobody:nobody 

[rpcinterface:supervisor] 
supervisor.rpcinterface_factory = supervisor.
rpcinterface:make_main_rpcinterface 

[supervisorctl] 
serverurl=unix:///tmp/supervisor.sock 
[supervisord] 
logfile = /var/log/supervisord.log 
logfile_maxbytes = 10MB 
logfile_backups=10 
loglevel = info 
pidfile = /var/run/supervisord.pid 
identifier = supervisor 
directory = /tmp 

[program:named] 
command=/usr/sbin/named -u named -f 
process_name=%(program_name)s 
numprocs=1 
directory=/var/named 
priority=100 
autostart=true 
autorestart=unexpected 
startsecs=5 
startretries=3 
exitcodes=0,2 
stopsignal=TERM 
stopwaitsecs=10 
redirect_stderr=false 
stdout_logfile=/var/log/named_supervisord.log 
stdout_logfile_maxbytes=1MB 
stdout_logfile_backups=10 
stdout_capture_maxbytes=1MB

Starting supervisord

With the configuration file in place, we can start supervisord. Make sure that BIND 9 is not started or you will end up with two instances of the BIND 9 server running, which isn't recommended. Also make sure that supervisord will be started on reboot of the server, either through a startscript or other means. Note that the supervisord packages bundled with Linux distributions install a startscript.

 root-shell> /etc/init.d/supervisord start 
root-shell> rndc status 
version: 9.9.6-P1 <id:3612d8fb> 
number of zones: 98 
debug level: 0 
xfers running: 0 
xfers deferred: 0 
soa queries in progress: 0 
query logging is OFF 
recursive clients: 0/0/1000 
tcp clients: 0/100 

server is up and running 
root-shell> ps -ef 
[...] 
root 10906 0.0 2.5 209096 12988 ? Ss 19:55 0:00 /usr/bin/python /usr/bin/supervisord -c /etc/supervisord.conf 
named 10908 0.7 1.6 44292 8112 ? S 19:55 0:00 /usr/sbin/named -u named -f 
root 10910 0.0 0.2 110228 1156 pts/0 R+ 19:55 0:00 ps aux 

root-shell> supervisorctl 
status 
named RUNNING pid 10908, uptime 0:03:19 
root-shell> chkconfig --add supervisord 
root-shell> chkconfig supervisord on

Great, supervisord has started, and it also started the BIND 9 process "named". DNS is working now.

Simulating a BIND 9 crash

To simulate a BIND 9 crash, we "kill" the BIND 9 named process:

root-shell> killall -9 named 

Supervisord should detect that the running BIND 9 process has terminated, and start a new one. DNS is still up and running.

Controlling supervisord

Supervisord can be controlled from the command line using the supervisorctl command. A list of all a control commands can be found with "help", and a description of each command with "help command":

 shell> supervisorctl help 
default commands (type help ): 
===================================== 
add clear fg open quit remove restart start stop update 
avail exit maintail pid reload reread shutdown status tail version 

shell> supervisorctl help status 
status Get all process status info. 
status Get status on a single process by name. 
status Get status on multiple named processes. 

shell> supervisorctl status named 
RUNNING pid 25770, uptime 0:00:12 

shell> supervisorctl stop named  named: stopped 

shell> supervisorctl start named 
named: started

Now, whenever there is a triggered assertion error in the code BIND 9 will terminate, but supervisord will bring it back from the dead. Your DNS service stays up, and your users and customers stay happy.

Read the supervisord documentation on how to setup event notifications, so that you get an e-mail notification should BIND 9 restart (should the outage be caused by a security vulnerability you might want to report it to bind9-bugs@isc.org as well).

Of course supervisord can be used to restart other processes as well, including other types of DNS Servers (NSD, Unbound, dnsmasq ...).

Topics: DNS, Linux, BIND, BIND 9, Supervisord, Red Hat

Why follow Men & Mice?

The Men & Mice blog publishes educational, informational, as well as product-related material for everyone and anyone interested in IP Address Management, DNS, DHCP, IPv6, DNSSEC and more.

Subscribe to Email Updates

Recent Posts