Authors: Corinne McKay and Daniel J. Urist
The hard drive is the single most likely point of failure in your computer, and the most critical component. While power supplies also frequently fail, modern journaled file systems will generally keep you from losing your data if this happens. If your machine has a single hard drive and nonexistent or insufficient backups, losing the hard drive may literally mean losing your business. A close relative of ours learned this the hard way. He ran a home office on a single hard drive machine with no reliable backup system. When the hard drive died, he lost two years’ worth of work, and spent several thousand dollars on data recovery that took several weeks and was only partially successful. Using RAID can turn a hard drive failure from a business-ending catastrophe into a minor inconvenience.
RAID, or Redundant Array of Independent (or Inexpensive) Disks, is a configuration of multiple hard drives to achieve fault tolerance and/or performance benefits on your system. For our purposes, the RAID level we’re interested in is RAID 1, also known as mirroring, meaning that the two drives contain identical information. Performance-wise, RAID 1 is theoretically a little faster for reads and a little slower for writes than a single hard drive, but this depends on the implementation. Practically speaking, there’s no significant performance difference between using RAID 1 and using a single disk. Both hard drives should have the same capacity; it’s simplest to use two of the same model.
When adding a second hard drive to your machine, be aware that the additional heat generated by the second drive may require additional cooling fans. Refer to the case manufacturer’s documentation for recommendations.
You can implement RAID 1 in two different ways, using software or hardware. Hardware RAID requires a RAID-capable hard drive controller. The RAID device is transparent to the operating system and looks and performs like a single disk as far as the OS is concerned. This means that all RAID administration is done through the hardware device, requiring software that talks directly to the hardware device, usually including a BIOS-level program to configure the device. Some higher-end hardware RAID devices also provide administrative software that runs under the operating system. However, the software, which always comes from the hardware RAID vendor, is usually not free, and it may not work with your version of Linux.
An advantage of hardware RAID is that many recent motherboards have it built in, and the user doesn’t have to do any operating system configuration to set up RAID (although you will need a driver that supports the RAID device itself). Most popular hardware RAID devices are supported under Linux. A significant disadvantage is that it may be impossible to convert a non-RAID machine to a RAID machine without completely backing up the existing hard drive, rebuilding it with a second drive as a RAID device, and then restoring your data. Another major issue is failure notification. If there is operating system-level administrative software, it may provide automated failure notification, but otherwise, you’ll have to carefully monitor your log files for failures.
Software RAID achieves the same end as hardware RAID, but it is handled by the operating system rather than by a hardware RAID controller. Most major Linux distributions let you create software RAID devices and install onto them. You can also convert an existing single-disk installation into a RAID 1 installation, although it’s tricky. Advantages of Linux software RAID include automated failure notification by email that is easy to set up, and the fact that software RAID can provide more redundancy than hardware RAID, because the mirrored drives can be attached to separate controllers.
While most distributions allow you to set up software RAID at install time, you will probably have to manually configure the RAID subsystem to send you email alerts of failures. You can do this by editing the mdadm.conf file (usually /etc/mdadm.conf or /etc/mdadm/mdadm.conf) and adding a MAILADDR directive; see the mdadm man page for details. You will also need to have a working local mail subsystem.
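As a rough illustration, the short Python sketch below checks whether the text of an mdadm.conf file already contains a MAILADDR line. It is only a sanity check under stated assumptions, not part of mdadm: the function name find_mailaddr and the sample address are hypothetical, and the sketch simply assumes the directive has the form "MAILADDR address" on a line of its own.

```python
def find_mailaddr(conf_text):
    """Return the address on the first MAILADDR line, or None if there is none."""
    for raw_line in conf_text.splitlines():
        # Ignore comments (everything after '#') and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # This sketch compares the keyword case-insensitively.
        if fields[0].upper() == "MAILADDR" and len(fields) > 1:
            return fields[1]
    return None


def needs_mailaddr(conf_text):
    """True when the configuration has no alert address yet."""
    return find_mailaddr(conf_text) is None
```

You might call needs_mailaddr(open("/etc/mdadm.conf").read()) after an install, adjusting the path to wherever your distribution keeps the file, and add the directive by hand if it returns True.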
It’s critical to install the RAIDed hard drives on separate controller channels if you’re using your motherboard’s on-board IDE controller. If this isn’t possible because you already have too many IDE devices, then you will need to install a second IDE controller. This has the added benefit of providing redundancy at the controller level, and this configuration is popular in enterprise-level installations. While a controller failure is far less likely than a hard drive or power supply failure, it will mean downtime if both of your RAID disks are attached to a single controller. However, it’s rarely catastrophic, since it probably won’t affect the data on your hard drives.
Backups and disaster recovery
Without a backup and disaster recovery plan, your small business is teetering on the edge of disaster. The first step toward safety is to assess your backup and recovery needs. Do you need data backups only, high availability, or a full disaster recovery and business continuation plan? Do you need to back up your applications and operating system, or just your user files? When developing your backup system, your guiding principle should be: how much data are you willing to lose, and how much effort and money are you willing to spend to make sure that your business isn’t disrupted in the event of a catastrophic event?
Although it involves a great deal of worst case scenario planning, needs assessment is critical to the continuation of your business in the event of a disaster. Various factors, technical and otherwise, should enter into your decision. For instance, how long can you go without working before your clients look for someone else to do the job, or before you have to take another job to generate income? Consider whether the information you work with is available elsewhere. If not, backups are doubly important. Finally, consider whether your business needs a physical location to work from. In some business sectors, home office workers can get by with a laptop in a cafe with Wi-Fi, while those who have daily face-to-face meetings with clients must consider where to meet if the office is a smoking ruin.
The nature of your business will also dictate how long you need to keep your backups, and how long your backup media needs to last. For instance, if you work with financial records in the US, you may be legally obligated to have your data accessible for seven years, meaning your media needs to last that long. Recent legislation in the US, such as the Sarbanes-Oxley Act, has placed even more emphasis on corporate data security.
The primary purpose of backups is not to restore your machine in the event of an avoidable hardware failure. If you’ve done your homework and built a highly available machine with RAID and ECC RAM, the main purpose of backups should be to retrieve deleted files and recover from a true disaster, such as a flood or fire. An event such as a failed hard drive should be an inconvenience, not a disaster. It’s a good idea to archive your whole system on a periodic basis, though you’re unlikely to run backups every day if it involves sitting at your computer and swapping five CDs while the backup program runs. Your daily data backups are most likely to get done on schedule if they run unattended.
Your choice of Linux distribution will affect how you do backups. Choose a generic install, trim out the packages you don’t need, and there’s not much of an imperative to back up your OS; just download it again if you have to. Modern package managers such as APT and yum make it simple to reinstall packages when necessary. In addition, not backing up your OS makes it easier to fit all of your backed up material on one disk, so your backups can run unattended.
The next step is to choose your backup media. Excluding the option of third-party Internet-storage services, or simply emailing yourself copies of important work-in-progress, options for backup include tape, hard drives, and optical media such as CDs or DVDs. Let’s look at the pros and cons of each.
Tape used to be the preferred backup medium, and digital linear tape (DLT) format is still the gold standard for enterprise backups. Tape is impractical for most home use because hard drive capacity has outstripped that of low-cost tape formats, leaving only the option of buying an extremely expensive enterprise-quality tape drive and expensive tapes to go with it. One advantage of modern DLT backup tapes is that they are extremely stable. Depending on the nature of your business, there are cases where buying a tape drive may make sense, particularly if you need to keep your backed-up data for a long time. DLT tapes are rated to last up to 30 years under optimal storage conditions.
Hard drives have recently gained in popularity as a backup medium due to their low cost, large capacity, and high speed. Options include using a removable USB hard drive, putting extra hard drives in your primary computer, or purchasing another computer to use as a backup device. While you might be able to buy an additional internal hard drive for as little as $50, the disadvantage of backing up to a hard drive is that the drive itself is relatively fragile, and it’s not portable when it’s installed inside a computer. For backing up very small amounts of data and if portability is a major concern, a keychain-style USB flash drive is a good option, since it is physically more stable than a conventional hard drive.
Optical media such as CDs and DVDs are good options for home office users. Almost every computer produced today has an optical media drive. The media itself is inexpensive, relatively stable, and has reasonable storage capacity. CDs and DVDs are also extremely portable, making them a good choice for off-site backups. Downsides include the fact that optical media is slow to write to and slower to read from than a hard drive. Optical media also does not last forever. Most unrecorded optical media is estimated to have a shelf life of between five and 10 years; how long your recorded disks will last depends on how you store them and whom you believe, so if this is a concern, you should check the manufacturer’s information on the product. Estimates range from 20 to 100 years for CD-RW, although some reports advise not using CD-RW for long-term backups at all.
Selecting backup software
For making a complete clone of your system, a program such as Mondo Rescue is excellent. Traditional utilities such as dump, tar, and cpio also work but are less convenient to use. All these programs can back up to your choice of media.
There are fewer good, simple options for unattended backups of user data. In our home office, we use the application Sitback, which can back up to your choice of media. Sitback’s virtues include easy scheduling of backups to run at a time when you aren’t working and reliable notification of backup success and failure. Sitback stores its archives in tar format, so you don’t need Sitback itself to read them; that dependency is often a problem with proprietary backup software. In our case, we schedule Sitback to run at 3 a.m. and pare down our data so that everything fits on one CD-RW. We run a full backup each night and no incremental backups. The rotation uses seven CD-RWs, one labeled for each day of the week, and we swap the disks each morning. Sitback does not directly support encryption, so it’s not a good option for offices handling confidential data.
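To make the nightly routine concrete, here is a minimal Python sketch of the same idea: a full backup written as a tar archive, named for the day of the week, with a check that it still fits on one disc. This is not Sitback’s code or configuration. The function names, the "backup" file prefix, and the 650 MB disc capacity are all assumptions made for illustration, and the step of burning the archive to a CD-RW is left out.

```python
import datetime
import os
import tarfile

# Assumed capacity of one CD-RW; adjust for the media you actually use.
CD_CAPACITY_BYTES = 650 * 1024 * 1024

DAY_NAMES = ["monday", "tuesday", "wednesday", "thursday",
             "friday", "saturday", "sunday"]


def weekday_archive_name(day, prefix="backup"):
    """Name the archive after the weekday, e.g. backup-monday.tar.gz."""
    return f"{prefix}-{DAY_NAMES[day.weekday()]}.tar.gz"


def make_full_backup(source_dir, dest_dir, day=None):
    """Write a full, compressed tar archive of source_dir into dest_dir.

    dest_dir should sit outside source_dir so the archive does not
    end up inside itself.
    """
    day = day or datetime.date.today()
    archive_path = os.path.join(dest_dir, weekday_archive_name(day))
    top_level = os.path.basename(os.path.normpath(source_dir))
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(source_dir, arcname=top_level)
    return archive_path


def fits_on_one_disc(archive_path, capacity=CD_CAPACITY_BYTES):
    """True if the archive is small enough for a single disc."""
    return os.path.getsize(archive_path) <= capacity
```

A scheduler such as cron could run a script built on these functions at 3 a.m.; because each weekday’s archive overwrites the one from the previous week, the result mirrors the seven-disk rotation described above.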
Another option for user data backups is the combination of DAR (Disk Archive), its scheduling program SaraB, and the GUI front end KDar. These programs work well for installations that won’t fit on one piece of optical media and that alternate between full and incremental backups, or that need to encrypt backups. SaraB is a powerful scheduler and can support many configurations of backup scheduling and automated failure notification. KDar’s easy-to-use restore function is especially useful if someone who works with you is less technically skilled and needs to be able to restore their own data. However, KDar cannot schedule unattended backups, and the DAR suite is not designed to work with tapes. It is also more work to set up than Sitback if you need only one disk’s worth of backups made in the same way every day.
Whatever backup method you choose, if you keep data in an RDBMS such as MySQL or PostgreSQL, you will need to follow special procedures (usually dumping the databases or shutting down the server) to back up the databases.
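One way to handle the dumping step is to run the database’s own dump tool just before the file backup starts, so that the dump file is picked up along with your other data. The Python sketch below builds and runs such a command using mysqldump or pg_dump. The function names are hypothetical, the sketch assumes the tools are installed and that credentials come from each tool’s usual configuration files, and the options shown are only one reasonable choice.

```python
import subprocess


def dump_command(engine, database, output_path):
    """Build the command line that dumps one database to a file."""
    if engine == "mysql":
        # --single-transaction gives a consistent snapshot for
        # transactional tables without locking them for the whole dump.
        return ["mysqldump", "--single-transaction",
                f"--result-file={output_path}", database]
    if engine == "postgresql":
        return ["pg_dump", f"--file={output_path}", database]
    raise ValueError(f"unsupported database engine: {engine}")


def dump_database(engine, database, output_path):
    """Run the dump and raise CalledProcessError if the tool fails."""
    subprocess.run(dump_command(engine, database, output_path), check=True)
```

Raising an error on failure matters here: a backup job that silently archives a stale or empty dump file is worse than one that stops and tells you.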
Offsite backups are worth considering, since all of your carefully labeled backup disks won’t do much good if your landlord locks you out of your house or your office floods. “Offsite” can mean various things: colocating a machine to use for hard drive backups, storing some of your most critical data online with an Internet storage service, or keeping backup disks someplace other than your house. If your data isn’t sensitive, this could be as simple as keeping the disks at a friend’s house. If you have secure or confidential client data, a safe deposit box may be a better bet; if longevity is a concern, check the media manufacturer’s storage condition recommendations, or consider using a data storage service, since some backup media must be stored under controlled conditions. If your offsite backups contain any confidential information, they should be encrypted as well.
Once you have backups in place, it’s critical to test them. There’s nothing as tedious as testing backups, but nothing as horrifying as realizing that you’ve lost a month’s worth of work on a project due in an hour, and that all of your carefully labeled backup disks are blank. So, we offer some suggestions for testing your backups.
First, set up a test schedule; for example, test your backups on the first business day of every month. If you have a large number of backups, test a random sample to make sure you can read them, and that they have the data that you expect. In addition, practice preventive medicine; if you have any media errors, discard the problem media immediately. Never assume that an error message is your backup program malfunctioning; follow up every error until you figure out what went wrong, then fix it. Make testing a priority. As urgent as it seems to meet today’s deadline, the month’s worth of data on your backup media is probably more important to your business.
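The monthly spot check can be partly automated. The following Python sketch picks a random sample of tar archives, reads every file in each one from start to finish, and confirms that the files you expect are present. It assumes your backups are tar archives copied to or mounted on the local file system; the function names and the default sample size of three are hypothetical choices, not features of any particular backup program.

```python
import random
import tarfile
import zlib


def verify_archive(path, expected_members=()):
    """Return a list of problems found in one archive (empty means OK)."""
    try:
        with tarfile.open(path, "r:*") as archive:
            members = archive.getmembers()
            for member in members:
                if member.isfile():
                    # Read the data all the way through so that damaged
                    # or truncated media is noticed, not just the headers.
                    extracted = archive.extractfile(member)
                    while extracted.read(1024 * 1024):
                        pass
    except (tarfile.TarError, OSError, EOFError, zlib.error) as error:
        return [f"{path}: unreadable ({error})"]
    names = {member.name for member in members}
    return [f"{path}: missing {name}"
            for name in expected_members if name not in names]


def check_backups(archive_paths, sample_size=3, expected_members=()):
    """Verify a random sample of archives and collect every problem."""
    paths = list(archive_paths)
    sample = random.sample(paths, min(sample_size, len(paths)))
    problems = []
    for path in sample:
        problems.extend(verify_archive(path, expected_members))
    return problems
```

An empty list from check_backups means the sampled archives were readable and complete; anything else deserves the follow-up described above. A check like this does not replace occasionally restoring a file by hand.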
With a backup plan in place, it’s much easier to concentrate on your work rather than on the ominous grinding noise coming from your hard drive. A little bit of effort up front can save not only countless hours and dollars later, but also possibly your business itself in the event of a true disaster.
Conclusion
The most important best practice for the Linux home office is having an attitude of professionalism toward your computer equipment. Look at your computer as a business production system, not as a do-it-all home entertainment system that you also use for business. By having the same attitude of professionalism toward your home office as you do toward your business, you’ll keep things running smoothly and profitably for a long time to come.