Keep your Web site online with a High Availability Linux Apache cluster

Failover clusters are used to ensure high availability of system services and applications even through crashes, hardware failures, and environmental mishaps. In this article, I’ll show you how to implement a rock-solid two-node high availability Apache cluster with the heartbeat application from The High-Availability Linux Project. I tested the cluster on Fedora Core 5, CentOS 4.3, and Ubuntu 6.06.1 LTS server distributions.

In a cluster environment, a high availability (HA) system is responsible for starting and stopping services, mounting and dismounting resources, monitoring the system availability in the cluster environment, and handling the ownership of the virtual IP address that’s shared between cluster nodes. The heartbeat service provides the basic functions required for the HA system.

The most common cluster configuration is the standby configuration described here. In a standby cluster, one node performs all the work while the other node sits idle. Heartbeat monitors the health of particular services, usually over a separate Ethernet interface used only for HA purposes, by sending a special ping. If a node fails for some reason, heartbeat transfers all the HA components to the healthy node. When the failed node recovers, it can resume its former status.

Installation and configuration

To test High Availability Linux, you need a second Ethernet adapter on each node to devote to heartbeat. Install the Apache Web server and the heartbeat program on both nodes. If the heartbeat package is not in any repository of your favorite distribution, you can download it. On my CentOS servers, I used yum to install the necessary software:

yum install -y httpd heartbeat

The configuration files for heartbeat are not in place when the software is installed. You need to copy them from the documentation folder to the /etc/ha.d/ folder:

cp /usr/share/doc/heartbeat*/ha.cf /etc/ha.d/
cp /usr/share/doc/heartbeat*/haresources /etc/ha.d/
cp /usr/share/doc/heartbeat*/authkeys /etc/ha.d/

In the /etc/hosts file you must add hostnames and IP addresses to let the two nodes see each other. In my case it looks like this:

192.168.1.1 node1.example.com node1
192.168.1.2 node2.example.com node2

Make sure you have the exact same /etc/hosts file on both nodes and that you’re able to ping both nodes. You can just copy the file from one node to another using secure copy:

scp /etc/hosts root@node2:/etc/
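Before moving on, it can help to confirm that both names really are in the file on each node. The following is a minimal sketch of such a check; hosts_has_name and check_cluster_hosts are hypothetical helper names of my own, not part of heartbeat, and the sketch only looks at the file's contents, not at actual network reachability.

```shell
# Sketch: confirm that every cluster hostname appears in a hosts file.
# hosts_has_name and check_cluster_hosts are hypothetical helpers.
hosts_has_name() {
    # $1 = hosts file, $2 = hostname to look for
    awk -v name="$2" '
        { sub(/#.*/, "") }    # drop comments before splitting fields
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }   # exit 0 only if the name was seen
    ' "$1"
}

check_cluster_hosts() {
    # $1 = hosts file; remaining arguments = hostnames that must be present
    cch_file=$1
    shift
    cch_missing=0
    for cch_name in "$@"; do
        if hosts_has_name "$cch_file" "$cch_name"; then
            echo "ok: $cch_name"
        else
            echo "missing: $cch_name"
            cch_missing=1
        fi
    done
    return "$cch_missing"
}
```

On each node you could then run: check_cluster_hosts /etc/hosts node1.example.com node2.example.com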

Next, modify the configuration file /etc/ha.d/ha.cf. Edit the following entries in order to get heartbeat to work:

logfile /var/log/ha-log #where to log everything from heartbeat
logfacility local0 #Facility to use for syslog/logger
keepalive 2 # the time between the heartbeats
deadtime 30 #how long until the host is declared dead
warntime 10 #how long before issuing "late heartbeat" warning
initdead 120 # Very first dead time (initdead)
udpport 694 #udp port for the bcast/ucast communication
bcast eth1 #on what interface to broadcast
ucast eth1 10.0.0.1 #this is a 2-node cluster, so no need to use multicast here
auto_failback on #we want the resources to automatically fail back to its primary node
node node1.example.com #the name of the first node
node node2.example.com #the name of the second node

These are the basic options necessary for heartbeat to work. The file has to be configured identically on both nodes, except for the “ucast” line, where you define the IP address of the peer to send packets to.
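The timing values are easy to get out of order when you edit them by hand. Here is a rough sketch of a sanity check for them. The ordering rule it applies (keepalive below warntime, warntime below deadtime, deadtime no larger than initdead) is my own rule of thumb rather than something heartbeat enforces, and the helper names hacf_value and check_hacf_timing are hypothetical. The sketch assumes plain integer seconds, as in the example above.

```shell
# Sketch: read the timing directives from an ha.cf-style file and check
# that they are ordered sensibly. Both helpers are hypothetical.
hacf_value() {
    # $1 = config file, $2 = directive name; prints the first value found
    awk -v key="$2" '
        { sub(/#.*/, "") }            # drop trailing comments
        $1 == key { print $2; exit }
    ' "$1"
}

check_hacf_timing() {
    # $1 = config file; assumes every value is a whole number of seconds
    cht_keep=$(hacf_value "$1" keepalive)
    cht_warn=$(hacf_value "$1" warntime)
    cht_dead=$(hacf_value "$1" deadtime)
    cht_init=$(hacf_value "$1" initdead)
    for cht_v in "$cht_keep" "$cht_warn" "$cht_dead" "$cht_init"; do
        case $cht_v in
            ''|*[!0-9]*)
                echo "error: missing or non-integer timing value"
                return 2 ;;
        esac
    done
    cht_msg="keepalive=$cht_keep warntime=$cht_warn deadtime=$cht_dead initdead=$cht_init"
    if [ "$cht_keep" -lt "$cht_warn" ] &&
       [ "$cht_warn" -lt "$cht_dead" ] &&
       [ "$cht_dead" -le "$cht_init" ]; then
        echo "timing ok: $cht_msg"
    else
        echo "timing suspicious: $cht_msg"
        return 1
    fi
}
```

Running check_hacf_timing /etc/ha.d/ha.cf on each node gives a quick sanity check before heartbeat is started for the first time.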

The next file is /etc/ha.d/haresources. In this file you need to define the master node name, virtual IP address (cluster IP), and which resources to start. In our case, we’re starting the Apache Web server.

We need only one line of data here:

node1.example.com 192.168.1.5 httpd

Make sure the file is exactly the same on both nodes. Note that the resource name is the name of the init script located in the /etc/init.d folder. If the resource name is not exactly the same as in /etc/init.d/, heartbeat will not be able to find it when it tries to read it and both Apache and heartbeat will fail to start.
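Since a mistyped resource name is such an easy way to break the cluster, a small check can save some debugging. The sketch below lists the resource names in a haresources-style file and reports any that have no executable script in the directories you pass in. check_haresources is a hypothetical helper name. Treating every field that starts with a digit as an IP address, and everything after "::" as a script argument, are simplifying assumptions of mine.

```shell
# Sketch: report haresources entries that have no matching script.
# check_haresources is a hypothetical helper, not part of heartbeat.
check_haresources() {
    # $1 = haresources file; remaining arguments = script directories
    chr_file=$1
    shift
    chr_status=0
    # Field 1 is the node name; later fields are resources.
    chr_list=$(awk '
        { sub(/#.*/, "") }
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^[0-9]/) continue   # assumed to be an IP address
                sub(/::.*/, "", $i)           # strip assumed script arguments
                print $i
            }
        }
    ' "$chr_file")
    for chr_res in $chr_list; do
        chr_found=0
        for chr_dir in "$@"; do
            if [ -x "$chr_dir/$chr_res" ]; then
                chr_found=1
            fi
        done
        if [ "$chr_found" -eq 1 ]; then
            echo "ok: $chr_res"
        else
            echo "missing: $chr_res"
            chr_status=1
        fi
    done
    return "$chr_status"
}
```

For the setup in this article you might call it as: check_haresources /etc/ha.d/haresources /etc/init.d /etc/ha.d/resource.d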

The last heartbeat-related file is /etc/ha.d/authkeys. This file must also be the same on both nodes, and it needs to be readable and writable only by the root user. If the permissions are different from what heartbeat expects, heartbeat will refuse to start. Make sure you have the file configured like this:

auth 1
1 crc

And make sure it’s readable and writable by root only:

chmod 600 /etc/ha.d/authkeys
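The crc method only protects against corrupted packets and does not authenticate the peer, so if your heartbeat link is not physically isolated you may prefer the sha1 method, which takes a shared key. What follows is a sketch of one way to generate such a file. write_authkeys is a hypothetical helper, the 20-byte random key length is an arbitrary choice of mine, and the output path is a parameter so you can try it on a scratch file first.

```shell
# Sketch: write an authkeys-style file that uses sha1 with a random key.
# write_authkeys is a hypothetical helper, not part of heartbeat.
write_authkeys() {
    # $1 = output path
    # 20 random bytes, printed as 40 hex characters
    wak_key=$(head -c 20 /dev/urandom | od -An -tx1 | tr -d ' \n')
    (
        umask 077                      # a newly created file gets mode 600
        printf 'auth 1\n1 sha1 %s\n' "$wak_key" > "$1"
    )
    chmod 600 "$1"                     # also covers a pre-existing file
}
```

Generate the file on one node only and copy it to the other (for example with scp -p), since both nodes must share the same key.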

Now it’s time to configure the Apache service. We want Apache to listen on the virtual IP address 192.168.1.5, and we need to point the Apache document root to the /data mount point where our Web files will be kept. Note that storage for Apache can be practically anything from the local filesystem folder to a storage area network. Of course, there is no point in a failover cluster if the same data is not available for both nodes. If you don’t own an external network-attached storage device (such as a Fibre Channel storage unit) you can mount any SMB, NFS, iSCSI, or SAN filesystem as a local folder so that each node can access the data when it is active. This is done by modifying the following entries in the /etc/httpd/conf/httpd.conf file (at least for the CentOS distribution):

Listen 192.168.1.5:80
DocumentRoot "/data"
<Directory "/data">

It’s important for the Apache service to not start automatically at boot time, since heartbeat will start and stop the service as needed. Disable the automatic start with the command (on a Red Hat-based system):

chkconfig httpd off

Make sure you have the same Apache configuration on both nodes.
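One way to verify that is to run the same check on each node. The sketch below confirms that an httpd.conf-style file binds to the virtual IP and serves from the shared document root. check_httpd_conf is a hypothetical helper, and it only looks at the first uncommented Listen and DocumentRoot lines, which is a simplification.

```shell
# Sketch: check the Listen and DocumentRoot values in an httpd.conf-style file.
# check_httpd_conf is a hypothetical helper, not an Apache tool.
check_httpd_conf() {
    # $1 = config file, $2 = expected "address:port", $3 = expected document root
    chc_listen=$(awk '$1 == "Listen" { print $2; exit }' "$1")
    chc_root=$(awk '$1 == "DocumentRoot" { gsub(/"/, "", $2); print $2; exit }' "$1")
    chc_status=0
    if [ "$chc_listen" != "$2" ]; then
        echo "Listen is '$chc_listen', expected '$2'"
        chc_status=1
    fi
    if [ "$chc_root" != "$3" ]; then
        echo "DocumentRoot is '$chc_root', expected '$3'"
        chc_status=1
    fi
    return "$chc_status"
}
```

With the values used in this article, the call would be: check_httpd_conf /etc/httpd/conf/httpd.conf 192.168.1.5:80 /data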

Now we test

At this point we’re done with configuration. Now it’s time to start the newly created cluster. Start the heartbeat service on both nodes:

/etc/init.d/heartbeat start

Watch the /var/log/ha-log on both nodes. If everything is configured correctly, you should see something like this in your log files:

Configuration validated. Starting heartbeat 1.2.3.cvs.20050927
heartbeat: version 1.2.3.cvs.20050927
Link node1.example.com:eth1 up.
Link node2.example.com:eth1 up.
Status update for node node2.example.com: status active
Local status now set to: 'active'
remote resource transition completed.
Local Resource acquisition completed. (none)
node2.example.com wants to go standby [foreign]
acquire local HA resources (standby).
local HA resource acquisition completed (standby).
Standby resource acquisition done [foreign].
Initial resource acquisition complete (auto_failback)
remote resource transition completed.

Now test the failover. Reboot the master server. The slave should take over the Apache service. If everything works well, you should see something like this:

Received shutdown notice from 'node1.example.com'.
Resources being acquired from node1.example.com.
acquire local HA resources (standby).
local HA resource acquisition completed (standby).
Standby resource acquisition done [foreign].
Running /etc/ha.d/rc.d/status status
Taking over resource group 192.168.1.5
Acquiring resource group: node1.example.com 192.168.1.5 httpd
mach_down takeover complete for node node1.example.com.
node node1.example.com: is dead
Dead node node1.example.com gave up resources.
Link node1.example.com:eth1 dead.

When the master comes back online, it should take the Apache service back:

Heartbeat restart on node node1.example.com
Link node1.example.com:eth1 up.
node2.example.com wants to go standby [foreign]
standby: node1.example.com can take our foreign resources
give up foreign HA resources (standby).
Releasing resource group: node1.example.com 192.168.1.5 httpd
Local standby process completed [foreign].
remote resource transition completed.
Other node completed standby takeover of foreign resources.
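The logs show what heartbeat thinks happened, but it is also useful to see the failover from a client's point of view. The sketch below polls a service a fixed number of times and prints a timestamped up or down line for each attempt. poll_service is a hypothetical helper, and the probe is whatever command you pass in, as long as it exits 0 when the service answers.

```shell
# Sketch: poll a service and count failed probes during a failover test.
# poll_service is a hypothetical helper, not part of heartbeat.
poll_service() {
    # $1 = number of attempts, $2 = seconds between attempts,
    # remaining arguments = probe command and its arguments
    ps_count=$1
    ps_delay=$2
    shift 2
    ps_down=0
    ps_i=0
    while [ "$ps_i" -lt "$ps_count" ]; do
        if "$@" >/dev/null 2>&1; then
            echo "$(date +%H:%M:%S) up"
        else
            echo "$(date +%H:%M:%S) down"
            ps_down=$((ps_down + 1))
        fi
        ps_i=$((ps_i + 1))
        # no need to wait after the final attempt
        if [ "$ps_i" -lt "$ps_count" ]; then
            sleep "$ps_delay"
        fi
    done
    echo "failed probes: $ps_down of $ps_count"
}
```

From a third machine, you could start something like poll_service 120 1 curl -s -f -m 2 -o /dev/null http://192.168.1.5/ and then reboot the master. The number of failed probes gives a rough idea of how long the virtual IP was unreachable.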

Conclusion

That’s all it takes to build a low-cost, highly available Web server cluster. There are of course many commercial products that accomplish the same goal, but for the production needs of a small business or similar institution, High Availability Linux and heartbeat are an excellent alternative.
