About eight or 10 years ago, many experts predicted that media backups would be replaced by disk-to-disk backups. Yet today, most System i installations are doing some form of tape backup. Why? Is it a comfort zone? Is it really needed? And if we are stuck with some form of backup, can we automate it?
I recently asked our staff, "Will backups ever go away?" Their short answer was "No." The truth is that the backup copies of our data are valuable assets. The backup media and other aspects of the backup task may change (for example, we may perform the backups through the network, across fiber optics, or by replicating data to a high availability system), but the need for backups will never go away.
Large Data Centers Improve Focus on Backups
In today's data center environment, very few companies have eliminated all forms of tape backup. My experience is with System i customers running large-scale, mainframe-like computer centers. Most of these customers have high-availability systems and high-speed networks, yet they continue to do fairly traditional backups. Their data centers reflect the decade-long trend toward managing computing centrally. They have grown through the consolidation of multiple remote systems into large-scale, multi-terabyte data centers.
This centralization offers many benefits, including enhanced security, improved disaster protection and fault tolerance, and, overall, a more consistent computing environment. Looking specifically at the effects of centralization on backups, we have noticed better attention to backups, disaster plans, and hot-site recovery testing. The downside has been that saving more data requires a longer backup window.
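To make the backup-window point concrete, here is a rough back-of-the-envelope sketch. It assumes a single save stream running at a steady rate, and the function name and the figures in the comment are purely illustrative, not measurements from any real installation.

```python
def backup_window_hours(data_gb: float, throughput_gb_per_hour: float) -> float:
    """Estimate how long a save runs, assuming one stream at a constant rate.

    Both parameters are hypothetical planning inputs; real saves vary with
    object counts, compression, and device contention.
    """
    if throughput_gb_per_hour <= 0:
        raise ValueError("throughput must be greater than zero")
    return data_gb / throughput_gb_per_hour


# Illustrative only: consolidating two 2,000 GB systems onto one doubles
# the window if the save rate stays the same.
before = backup_window_hours(2000, 400)
after = backup_window_hours(4000, 400)
```

Under these assumptions the window grows in direct proportion to the data saved, which is why consolidation tends to push sites toward faster devices or parallel saves.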
I like to think of backups as being like an insurance policy. Performing the backup is like paying the premium each month. The payoff comes when disaster strikes and you can successfully restore your company's data at a recovery center without jumping through hoops. Yes, the real backup issue is recoverability. Along with recoverability, automation is also key.
Six Key Areas to Automate
Automating computer systems has two huge benefits: First, it increases staff productivity, and second, it reduces errors (versus doing tasks manually). If we start to think about automated backups as an insurance policy, what features should we look for in such a policy? I think there are six key areas:
- Tape/media tracking
- Robotics
- Virtual tape
- Restricted state
- Restoration reporting
- Regulations
Tape/Media Tracking
Whether you are using a disk-to-disk network backup or a disk-to-tape solution, it is important to have a system that eliminates manual recording of storage information. A good tape/media tracking feature provides an easy-to-use inventory of what is stored on the backup media and where the media are located. Ideally, the system also provides a central repository of information for all systems in the data center, defines media retention periods, provides easy-to-use reporting, and tracks ad hoc backups by IT staff.
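As a rough illustration of what such a tracking feature records, the sketch below models a central media inventory with retention periods. It is not how any particular product works; the class names, field names, and sample values are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class MediaRecord:
    """One backup volume in the inventory (all field names are illustrative)."""
    volume_id: str       # barcode or volume label
    system: str          # system the save came from
    contents: str        # what was saved
    location: str        # where the media currently lives
    saved_on: date       # date the backup was written
    retention_days: int  # how long the volume must be kept

    def expires_on(self) -> date:
        """Date after which the volume may be reused."""
        return self.saved_on + timedelta(days=self.retention_days)


class MediaCatalog:
    """Hypothetical central repository covering every system in the data center."""

    def __init__(self) -> None:
        self._records = {}

    def register(self, record: MediaRecord) -> None:
        """Record a volume, whether from a scheduled or an ad hoc backup."""
        self._records[record.volume_id] = record

    def find(self, system: str, contents: str) -> list:
        """Return volumes holding the requested contents, newest first."""
        hits = [r for r in self._records.values()
                if r.system == system and r.contents == contents]
        return sorted(hits, key=lambda r: r.saved_on, reverse=True)

    def expired(self, today: date) -> list:
        """Return volumes whose retention period has already passed."""
        return [r for r in self._records.values() if r.expires_on() < today]
```

With records like these, "what is on this volume, where is it, and when can it be reused" becomes a lookup rather than a search through handwritten logs.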
Robotics
Today, robotic devices can load tapes by infrared barcodes, eliminating the need to manually mount tapes. While this technology is needed to run totally automated backups with no operator in the data center, it introduces another level of complexity. When you acquire this equipment, be sure to get all the vendor training you can. Unfortunately, hardware vendors don't seem to have a lot of knowledge about backup and recovery software, so it falls to the users to try to successfully marry the two. Once you get over that hurdle and have the hardware and software working in harmony, you can enjoy the benefits of automation.
Virtual Tape
When IBM introduced V5R4 last year, it included virtual tape. System administrators can now save data to a directory in the V5R4 Integrated File System (IFS) to perform what is essentially a disk-to-disk backup. The IFS support is important because virtual tape can be on an auxiliary storage pool (ASP) that is local or remote. Over a high-speed optical connection, backup speed is very good, and media errors (which can stop unattended backups cold) are eliminated. In my mind, though, virtual tape does you no good if your backups are still launched manually by an operator.
IBM's idea for virtual tape was that it would mimic saving to a tape drive. Therefore, IBM retained the concept of a "volume," and that makes tracking backups to virtual tape easier. It gives backup-and-recovery software the ability to support the technology through media management features. If you need more information about this technology, read the IBM Redbook i5/OS V5R4 Virtual Tape: A Guide to Planning and Implementation.
Restricted State
When you back up the operating system on the System i, there must be no other activity on the system. This is called a "restricted state." If you can afford downtime once a week, I recommend automating restricted state backups. With increased tape capacity and software automation tools, you can run an automated and monitored restricted state backup completely unattended. For companies that need 24x7 system availability, this is a more painful issue. From a technical standpoint, you don't really need to do a SAVSYS operation in i5/OS. The dynamic part of the OS can be backed up by commands that do not require a restricted state. However, very few people skip this backup. It is an ingrained habit on this platform, and a recent SAVSYS makes it easier to restore the system when needed. A new trend to eliminate downtime is to use HA software to replicate the system and then run restricted-state backups on the HA system.
Restoration Reporting
As backups complete, the data restoration plan should be automatically updated in the database. Then, the updated plan should be transmitted automatically (using email or FTP) to your remote hot site or off-site storage. You can put your backup reports into PDF format automatically and send them where needed, too. The restoration process should be periodically tested at the hot site under realistic conditions. In speaking with experts at both IBM and SunGard, I have concluded that too many people do not take into consideration that disasters do not have a timetable. Administrators do a full-system save one weekend and a system test the next. This is not realistic. You need to test your recoverability with the assumption that the failure has occurred at a less-than-optimal time, such as midday.
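The midday scenario can be put into numbers with a small sketch. It assumes you know when each save completed and simply measures the gap between the last usable save and a hypothetical failure time. The function names and dates are illustrative, not drawn from any real site.

```python
from datetime import datetime, timedelta


def latest_usable_backup(backups: list, failure: datetime) -> datetime:
    """Return the most recent save that completed at or before the failure."""
    usable = [b for b in backups if b <= failure]
    if not usable:
        raise ValueError("no backup completed before the failure time")
    return max(usable)


def exposure(backups: list, failure: datetime) -> timedelta:
    """Gap between the last usable save and the failure.

    Changes made during this gap are not on the backup media and would
    have to be recovered some other way.
    """
    return failure - latest_usable_backup(backups, failure)


# Illustrative schedule: full-system saves finishing early on two Sundays.
saves = [datetime(2024, 3, 3, 2, 0), datetime(2024, 3, 10, 2, 0)]

# A test staged right after the weekend save looks comfortable...
best_case = exposure(saves, datetime(2024, 3, 10, 6, 0))

# ...but a failure at midday on Wednesday is the case to rehearse.
midweek = exposure(saves, datetime(2024, 3, 13, 12, 0))
```

In this made-up schedule the best case is four hours of exposure, while the midweek failure leaves three days and ten hours of work outside the last full save. A recovery test built around the second number is closer to what a real disaster would demand.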
Regulations
If your team cannot figure out how to automate the processes and procedures around backup and recovery, you will continually spend enormous amounts of time handling Sarbanes-Oxley (SOX) and other regulatory requirements. Backups will not be going away anytime soon, and you should focus on alleviating the manual paper trail in this area. Your tape management systems should provide the reporting you need for your auditors. If they don't, you should look for a better, alternative solution.
The need for backups is not going away. We will continue to see technological improvements. Backup media are likely to become denser and to provide higher performance. SAN and HA technology will improve in usability over time. Solid-state technology (such as thumb drives) will become cheaper, eating away at the standard technologies of disk and tape. Disk drives and tapes are mechanical devices with moving parts and friction, things that are far more likely to break than electronic parts. Electronics are also orders of magnitude faster than mechanics. So, in the future, your backup device might be something like a big thumb drive! In the end, I recommend that all data center management teams evaluate technological changes based on whether they help the center become more automated.
Tom Huntington is Vice President of Technical Services for Help/Systems, Inc., where he has worked for the past 18 years. He has an extensive background in computer operations and iSeries-centric development. He works with domestic and international customers in a variety of industries of different sizes to help promote Help/Systems' Robot Automated Operations Solution, a system automation suite, and SEQUEL, a business intelligence product.