Now, open your /etc/hosts file and add three lines: one for your
primary server, one for your secondary redundant server and one for the
mail service address. Call the servers server1 and server2, call the
service address mail, and set the IP addresses appropriately. It should
look something like this:
192.168.1.1 server1
192.168.1.2 server2
192.168.1.5 mail
Finally, on both your master and slave server, make a folder
called /replicated, and add the following line to the /etc/fstab file:
/dev/drbd0 /replicated ext3 noauto 0 0
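If you would rather script these edits than make them by hand, here is a minimal sketch. The helper name append_once is my own invention for illustration, and the addresses are simply the example values from above. It adds a line to a file only when that exact line is not already there, so running it twice does no harm.

```shell
# Append LINE to FILE only if that exact line is not already present.
# Usage: append_once FILE LINE   (append_once is a hypothetical helper name)
append_once() (
    file=$1
    line=$2
    # -x: match the whole line, -F: fixed string, -q: quiet
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
)

# Example, to be run as root on both servers:
#   append_once /etc/hosts "192.168.1.1 server1"
#   append_once /etc/hosts "192.168.1.2 server2"
#   append_once /etc/hosts "192.168.1.5 mail"
#   mkdir -p /replicated
#   append_once /etc/fstab "/dev/drbd0 /replicated ext3 noauto 0 0"
```

The example calls are left as comments so that nothing touches your system files until you decide to run them.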
Configuring DRBD
After you've done that, you have to set up DRBD before moving forward
with Heartbeat. In my setup, the configuration file is /etc/drbd.conf, but
that can change depending on distribution and compile time options, so
try to find the file and open it now so you can follow along. If you
can't find it, simply create one called /etc/drbd.conf.
Listing 1 is my configuration file. I go over it line by line and add
explanations as comments that begin with the # character.
Now, let's test it by starting the DRBD driver to see if everything works
as it should. On your command line on both servers type:
drbdadm create-md drbd0; /etc/init.d/drbd restart; cat /proc/drbd
If all goes well, the output of the last command should look something
like this:
0: cs:Connected st:Secondary/Secondary ds:Inconsistent/Inconsistent r---
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
resync: used:0/7 hits:0 misses:0 starving:0 dirty:0 changed:0
act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
Note: you always can find information about the DRBD status by typing:
cat /proc/drbd
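If you want to check the status from a script, the fields in that output are easy to pick apart. The following is a rough sketch that assumes the status line format shown above (device 0, with cs:, st: and ds: fields). The function name drbd_field is made up for this example, and other DRBD versions may label the fields differently.

```shell
# Print the value of one "key:value" field (for example cs, st or ds) from
# DRBD status text read on standard input, for the line of device 0.
# drbd_field is a hypothetical helper name.
drbd_field() {
    awk -v key="$1" '
        $1 == "0:" {
            for (i = 2; i <= NF; i++)
                if (index($i, key ":") == 1) {
                    # Strip the "key:" prefix and print the rest
                    print substr($i, length(key) + 2)
                    exit
                }
        }'
}

# Example:
#   drbd_field cs < /proc/drbd    # connection state
#   drbd_field ds < /proc/drbd    # disk states (local/peer)
```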
Now, type the following command on the master system:
drbdadm -- --overwrite-data-of-peer primary drbd0; cat /proc/drbd
The output should look something like this:
0: cs:SyncSource st:Primary/Secondary ds:UpToDate/Inconsistent r---
ns:65216 nr:0 dw:0 dr:65408 al:0 bm:3 lo:0 pe:7 ua:6 ap:0
[>...................] sync'ed: 2.3% (3083548/3148572)K
finish: 0:04:43 speed: 10,836 (10,836) K/sec
resync: used:1/7 hits:4072 misses:4 starving:0 dirty:0 changed:4
act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
This means it is syncing your disks from the master computer that is set
as the primary one to the slave computer that is set as secondary.
Next, create the filesystem by typing the following on the master system:
mkfs.ext3 /dev/drbd0
Once that is done, on the master computer, go ahead and mount the drive
/dev/drbd0 on the /replicated directory we created for it. We'll have to
mount it manually for now until we set up Heartbeat.
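The manual mount can be sketched as follows. The two function names are hypothetical, and the check against /proc/mounts is my own addition so that the mount is not attempted twice.

```shell
# Succeed if MOUNTPOINT is listed as a mount point in MOUNTS_FILE
# (defaults to /proc/mounts). is_mounted is a hypothetical helper name.
is_mounted() (
    mountpoint=$1
    mounts_file=${2:-/proc/mounts}
    awk -v mp="$mountpoint" '$2 == mp { found = 1 } END { exit !found }' "$mounts_file"
)

# Mount the DRBD device by hand unless it is already mounted.
# Run as root on the master only.
mount_replicated() {
    is_mounted /replicated || mount /dev/drbd0 /replicated
}

# Example:
#   mount_replicated
```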
An important part of any redundant solution is properly preparing your
services so that when the master machine fails, the slave machine can
take over and run those services seamlessly. To do that, you have to move
not only the data to the replicated DRBD disk, but also move the
configuration files.
Let me show you how I've got Sendmail set up to handle the mail and store
it on the replicated drives. I use Sendmail for this example as it is one
step more complicated than the other services, because even if the
machine is running in slave mode, it may need to send e-mail
notifications from internal applications, and if Sendmail can't access
the configuration files, it won't be able to do this.
On the master machine, first make sure Sendmail is installed but stopped.
Then create an etc directory on your /replicated drive. After that, copy
your /etc/mail directory into /replicated/etc, and replace /etc/mail
with a symlink that points to /replicated/etc/mail.
Next, make a var directory on the /replicated drive, and copy /var/mail,
/var/spool/mqueue and any other mail data folders into that directory.
Then, of course, create the appropriate symlinks so that the new folders
are accessible from their previous locations.
Your /replicated directory structure should now look something like:
/replicated/etc/mail
/replicated/var/mail
/replicated/var/spool/mqueue
/replicated/var/spool/mqueue-client
/replicated/var/spool/mail
And, on your main drive, those folders should be symlinks and look
something like:
/etc/mail -> /replicated/etc/mail
/var/mail -> /replicated/var/mail
/var/spool/mqueue -> /replicated/var/spool/mqueue
/var/spool/mqueue-client -> /replicated/var/spool/mqueue-client
/var/spool/mail -> /replicated/var/spool/mail
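The copy-and-symlink steps are the same for every folder, so they can be wrapped in a small function. This is only a sketch: the name relocate and the .orig backup suffix are my own choices. It assumes Sendmail is stopped and /replicated is mounted on the master.

```shell
# Move a directory onto the replicated drive and leave a symlink behind.
# Usage: relocate SOURCE_DIR REPLICATED_ROOT
# e.g.   relocate /etc/mail /replicated  gives  /etc/mail -> /replicated/etc/mail
# relocate is a hypothetical helper name.
relocate() (
    src=$1
    root=$2
    dest=$root$src
    # Nothing to do if the source is missing or is already a symlink
    [ -d "$src" ] && [ ! -L "$src" ] || exit 0
    # Refuse to overwrite an existing copy on the replicated drive
    [ ! -e "$dest" ] || { echo "relocate: $dest already exists" >&2; exit 1; }
    mkdir -p "$(dirname "$dest")"
    cp -a "$src" "$dest"      # -a keeps ownership, permissions and timestamps
    mv "$src" "$src.orig"     # keep the original as a backup
    ln -s "$dest" "$src"
)

# Example:
#   relocate /etc/mail /replicated
#   relocate /var/spool/mqueue /replicated
```

Keeping the original under a .orig name is a precaution. You can remove it once you are happy that Sendmail works from the replicated copy.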
Now, start Sendmail again and give it a try. If all is working well,
you've successfully finished the first part of the setup.
The next part is to make sure it runs, even on the slave. The trick we
use is copying the Sendmail binary onto the mounted /replicated drive and
putting a symlink to the binary ssmtp on the unmounted /replicated
folder.
First, make sure you have ssmtp installed and configured on your system.
Next, make a directory /replicated/usr/sbin, and copy /usr/sbin/sendmail
to that directory. Then, symlink from /usr/sbin/sendmail back to
/replicated/usr/sbin/sendmail.
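Once that's done, shut down Sendmail and unmount the /replicated drive.
Then, on both the master and slave computers, create a folder
/replicated/usr/sbin in the now-unmounted /replicated directory, and
inside it create a symlink named sendmail that points to /usr/sbin/ssmtp.
That way, the path /replicated/usr/sbin/sendmail leads to the real
Sendmail binary while the drive is mounted and to ssmtp while it is not.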
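Here is a sketch of that fallback step, under the assumption that ssmtp lives at /usr/sbin/ssmtp. The function name install_fallback is hypothetical. It must be run while the DRBD device is not mounted, so that the link ends up on the local disk underneath the mount point.

```shell
# Create the fallback link that is visible only while the drive is unmounted:
#   REPLICATED_DIR/usr/sbin/sendmail -> SSMTP_PATH
# install_fallback is a hypothetical helper name.
install_fallback() (
    replicated=$1    # e.g. /replicated (must NOT be mounted at this point)
    ssmtp=$2         # e.g. /usr/sbin/ssmtp (assumed install location)
    mkdir -p "$replicated/usr/sbin"
    # -s symbolic, -f replace an existing link, -n do not follow an existing link
    ln -sfn "$ssmtp" "$replicated/usr/sbin/sendmail"
)

# Example, on both servers, with /replicated unmounted:
#   install_fallback /replicated /usr/sbin/ssmtp
```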
After setting up Sendmail, setting up other services like Apache and
PostgreSQL will seem like a breeze. Just remember to put all their data
and configuration files on the /replicated drive and to create the
appropriate symlinks.
Configuring Heartbeat
Heartbeat is designed to monitor your servers, and if your master server
fails, it will start up all the services on the slave server, turning it
into the master. To configure it, we need to specify which servers it
should monitor and which services it should start when one fails.
Let's configure the services first. We'll take a look at the Sendmail we
configured previously, because the other services are configured the same
way. First, go to the directory /etc/heartbeat/resource.d. This directory
holds all the startup scripts for the services Heartbeat will start up.
Now, in that directory, add a symlink that points to /etc/init.d/sendmail.
Note: keep in mind that these paths may vary depending on your Linux
distribution.
With that done, set up Heartbeat to start up services automatically on
the master computer, and turn the slave to the master if it fails.
Listing 2 shows the file that does that, and in it, you can see we have
only one line, which has different resources to be started on the given
server, separated by spaces.
The first entry, server1, defines which server should be the default
master of these services; the second one, IPaddr::192.168.1.5/24, tells
Heartbeat to configure this as an additional IP address on the master
server with the given netmask. Next, with datadisk::drbd0 we tell
Heartbeat to mount this drive automatically on the master, and after
this, we can enter the names of all the services we want to start up—in
this case, we put sendmail.
Note: these names should be the same as the filename for their startup
script in /etc/heartbeat/resource.d.
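To make the order of the fields concrete, here is a small sketch that assembles such a line from its parts. The function name haresources_line is invented for illustration, and I am assuming the line ends up in Heartbeat's resource file (commonly /etc/heartbeat/haresources). Treat Listing 2 as the authoritative version.

```shell
# Build one resource line: default master, service IP with netmask,
# DRBD resource, then the services to start, all separated by spaces.
# haresources_line is a hypothetical helper name.
haresources_line() (
    master=$1
    ip=$2
    drbd=$3
    shift 3
    printf '%s IPaddr::%s datadisk::%s %s\n' "$master" "$ip" "$drbd" "$*"
)

# Example:
#   haresources_line server1 192.168.1.5/24 drbd0 sendmail
```

Any further services you want Heartbeat to start go at the end as extra arguments.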
Next, let's configure the /etc/heartbeat/ha.cf file (Listing 3). The main
things you would want to change in it are the hostnames of the
master and slave machines at the bottom, and the deadtime and initdead
values. These
specify how many seconds of silence should be allowed from the other
machine before assuming it's dead and taking over.
If you set this too low, you might have false positives, and unless
you've got a system called STONITH in place, which will kill the other
machine if it thinks it's already dead, you can have all kinds of
problems. I set mine at two minutes; it's what has worked best for me,
but feel free to experiment.
Also keep in mind the following two points. First, for the serial
connection to work, you need to plug in a crossover serial cable between
the machines. Second, if you don't use a crossover network cable between
the machines but instead go through a hub where you have other Heartbeat
nodes, you have to change the udpport for each master/slave node set, or
your log file will get filled with warning messages.
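As a rough illustration of the settings discussed here, the sketch below prints a fragment of an ha.cf file. Every value in it is an assumption made for the example: the interface eth0, the serial port /dev/ttyS0, the baud rate, the keepalive interval and the initdead of twice the deadtime are not taken from Listing 3, and the function name ha_cf_fragment is hypothetical. Use Listing 3 as the real reference.

```shell
# Print an illustrative ha.cf fragment. All values are example assumptions.
# Usage: ha_cf_fragment UDPPORT   (ha_cf_fragment is a hypothetical name)
ha_cf_fragment() (
    udpport=$1    # pick a different port for each master/slave pair on a shared hub
    cat <<EOF
keepalive 2
deadtime 120
initdead 240
udpport $udpport
bcast eth0
baud 19200
serial /dev/ttyS0
node server1
node server2
EOF
)

# Example: print a fragment for a second cluster pair on the same network
#   ha_cf_fragment 695
```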
Now, all that's left to do is start your Heartbeat on both the master and
slave server by typing:
/etc/init.d/heartbeat start
Once you've got that up and running, it's time to test it. You can do
that by stopping Heartbeat on the master server and watching to see
whether the slave server becomes the master. Then, of course, you might
want to try it by completely powering down the master server or by
running other disconnection tests.
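One way to see which machine is currently acting as master is to check which one holds the service address. This sketch assumes the ip tool from iproute2 is installed and uses the example address 192.168.1.5. The function name has_address is made up for this example.

```shell
# Succeed if ADDRESS appears as an IPv4 address in `ip -o -4 addr show`
# output read on standard input. has_address is a hypothetical helper name.
has_address() {
    awk -v addr="$1" '
        $3 == "inet" {
            split($4, parts, "/")     # strip the /netmask suffix
            if (parts[1] == addr) found = 1
        }
        END { exit !found }'
}

# Example, run on either server during a failover test:
#   ip -o -4 addr show | has_address 192.168.1.5 && echo "this node is the master"
```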
Congratulations on setting up your redundant server system! And,
remember, Heartbeat and DRBD are fairly flexible, and you can put
together some complex solutions, including having one server be the
master of one DRBD partition and the slave of another. Take some time, play
around with them and see what you can discover.
Pedro Pla (pedropla@pedropla.com) is CTO of the Holiday Marketing
International group of companies, and he has more than ten years of Linux
experience.