The need for business continuity has driven demand for quick and robust disaster-recovery solutions, with data replication rising to the fore as a key enabling technology.
Unfortunately, many traditional replication approaches have forced customers to make difficult trade-offs in acquisition cost, management complexity, data consistency, distance, and performance. Enterprises can no longer afford to make the sort of compromises they have made historically when implementing replication.
The data replication market is currently at a major inflection point, with advanced technologies available that cut through the traditional cost and complexity to enable full replication without trade-offs. This makes it a logical time to consider implementing remote replication, but users need to understand the challenges that remain and familiarize themselves with the newer replication technologies available before they jump on the replication bandwagon.
Over the past five years, replication technologies have enjoyed heightened adoption. Based on the Taneja Group's research, there are several factors behind this trend. At the highest level, corporations are using replication technologies across a much broader swath of infrastructure than ever before, and we see four main drivers behind that expansion.
Although the topic of disaster recovery is on everyone’s mind and replication adoption has increased, it can still be difficult and costly to implement a remote replication solution.
Disaster recovery in today’s environments is a complicated equation that must weigh the risk of an outage and the importance of the application against the cost to administer and procure the technologies. Based on the results of our end-user surveys, three major challenges emerge when one is considering a replication solution:
Traditionally, disaster-recovery solutions, including replication, have been costly. Replication and mirroring technology built into storage systems from the large vendors typically comes with a high price tag, not to mention the cost of the overall infrastructure needed to support site-to-site disaster recovery.
We recommend that all IT purchasers conduct a thorough return on investment (ROI) calculation before purchasing replication solutions. Specifically, users must understand the cost tradeoffs between using Fibre Channel or IP connectivity, including understanding the costs per network port and the costs of individual replication and mirroring solutions.
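To illustrate the kind of side-by-side comparison we have in mind, the following Python sketch totals acquisition and recurring link costs for a Fibre Channel-based and an IP-based deployment over a three-year horizon. All port counts, prices, and license figures are hypothetical placeholders, not market data; substitute real quotes before drawing conclusions.

```python
# Rough cost-comparison sketch: FC-based vs. IP-based replication connectivity.
# Every figure below is a hypothetical placeholder, not a vendor price.

def total_cost(port_cost, ports, link_cost_per_month, software_license, months=36):
    """Acquisition cost plus recurring link cost over the planning horizon."""
    return port_cost * ports + software_license + link_cost_per_month * months

fc_option = total_cost(port_cost=1500, ports=8,      # FC switch ports at both sites
                       link_cost_per_month=4000,     # FCIP gateway / dark-fiber circuit
                       software_license=50000)       # array-based replication license

ip_option = total_cost(port_cost=300, ports=8,       # Gigabit Ethernet ports
                       link_cost_per_month=1500,     # leased IP circuit
                       software_license=20000)       # host- or appliance-based replication

print(f"FC-based option over 3 years: ${fc_option:,}")
print(f"IP-based option over 3 years: ${ip_option:,}")
```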
Deploying replication technology has often been fraught with pitfalls. In site-to-site recovery scenarios, two operations must be executed flawlessly for a successful recovery: synchronizing the two copies of data, and ensuring the backup site always has a consistent recovery point. Neither is a trivial exercise.
There is also significant complexity in ensuring the environment is preserved perfectly between locations. Hosts at the secondary site must have exactly the same operating environment (including operating system and application images) as the primary. Moreover, during a recovery at the secondary site, the storage volumes must be quickly mounted and made accessible to the proper hosts to hit low recovery time objectives (RTOs). If this is not done properly, a lengthy recovery and consistency-checking process must occur before the system can come up. In short, replication and site-to-site disaster recovery involve a complex set of manual tasks.
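One way to keep an eye on this kind of environment mismatch is a simple automated comparison of primary and secondary host inventories. The Python sketch below compares two hard-coded, hypothetical inventories (OS build, application version, mounted volumes); the field names and values are illustrative only and not drawn from any particular product.

```python
# Minimal sketch: flag configuration drift between a primary and a secondary host.
# In practice the inventories would come from an agent or CMDB export;
# here they are hard-coded, hypothetical examples.

primary = {
    "os_version": "Windows Server 2003 SP1",
    "app_version": "ERP 6.2.1",
    "volumes": {"E:": "db_data", "F:": "db_logs"},
}

secondary = {
    "os_version": "Windows Server 2003 SP1",
    "app_version": "ERP 6.1.9",           # out-of-date patch level
    "volumes": {"E:": "db_data"},          # missing the log volume
}

def report_drift(primary, secondary):
    for key in primary:
        if primary[key] != secondary.get(key):
            print(f"DRIFT in {key}: primary={primary[key]!r}  secondary={secondary.get(key)!r}")

report_drift(primary, secondary)
```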
For these reasons, we recommend that end users do a full evaluation of any replication product and establish metrics on how long it takes to accomplish specific administrative tasks, such as mounting a volume on the secondary site. Although ease of use can be a soft-value criterion, it is absolutely essential that products be benchmarked against management complexity since replication must work flawlessly when it is needed most.
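Even a bare-bones stopwatch around each administrative step yields numbers that can be compared across products. The sketch below times a list of task functions; the task names are hypothetical, and time.sleep stands in for the real work of mounting or presenting a replicated volume.

```python
import time

# Bare-bones benchmark harness: time each administrative task and report it.
# The task bodies are placeholders (time.sleep stands in for the real work,
# such as importing and mounting a replicated volume at the secondary site).

def mount_secondary_volume():
    time.sleep(1.2)   # stand-in for the actual mount/import operation

def present_volume_to_host():
    time.sleep(0.4)   # stand-in for LUN masking and a host rescan

def benchmark(tasks):
    for task in tasks:
        start = time.perf_counter()
        task()
        elapsed = time.perf_counter() - start
        print(f"{task.__name__}: {elapsed:.1f} s")

benchmark([mount_secondary_volume, present_volume_to_host])
```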
Overall replication performance and the recovery point objective (RPO) are gated by the bandwidth and latency of the link between the two sites. Provisioning the link and determining the bandwidth requirements of any replication deployment are critical planning items.
In general, there is a direct relationship between how often data changes on the primary system and the amount of bandwidth that is consumed. Some vendors replicate using thin-provisioned volumes, which means only actual changes to the data, not allocated but unwritten capacity, are replicated. The algorithm that a replication vendor uses will also dramatically influence how much bandwidth is consumed and whether a lower-cost link will be sufficient.
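As a rough planning illustration (the input figures are invented, not measurements), the sketch below converts a daily change rate into the sustained bandwidth an asynchronous link would need to keep pace, and shows how replicating only written, thin-provisioned blocks lowers that requirement.

```python
# Back-of-the-envelope link sizing: sustained bandwidth needed to keep up
# with a given daily change rate. All inputs are hypothetical examples.

def required_mbps(changed_gb_per_day, replication_window_hours=24, overhead=1.2):
    """Convert changed data per day into sustained megabits/second, with protocol overhead."""
    megabits_per_day = changed_gb_per_day * 8 * 1000   # GB/day -> megabits/day (decimal units)
    return megabits_per_day * overhead / (replication_window_hours * 3600)

full_volume_churn = required_mbps(changed_gb_per_day=200)   # replicating at the allocated-capacity level
thin_written_only = required_mbps(changed_gb_per_day=60)    # replicating only written blocks

print(f"Allocated-capacity approach: ~{full_volume_churn:.0f} Mbps sustained")
print(f"Thin/written-blocks approach: ~{thin_written_only:.0f} Mbps sustained")
```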
We recommend that users look for a replication product with advanced bandwidth shaping algorithms. In some replication products, administrators can set quality of service (QoS) thresholds to either throttle or speed up the replication. We also recommend looking at solutions that replicate thin-provisioned volumes, as this can also have a significant impact on bandwidth requirements.
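To make the shaping idea concrete, here is a minimal token-bucket style throttle in Python. It is a generic illustration of capping replication traffic at a configured rate, not the mechanism any particular replication product uses, and the rate and chunk sizes are arbitrary example values.

```python
import time

# Generic token-bucket throttle: caps outbound replication traffic at a configured rate.
# The rate and chunk sizes below are illustrative values only.

class Throttle:
    def __init__(self, rate_bytes_per_sec):
        self.rate = rate_bytes_per_sec
        self.allowance = rate_bytes_per_sec
        self.last = time.monotonic()

    def consume(self, nbytes):
        now = time.monotonic()
        # Refill the budget in proportion to elapsed time, capped at one second's worth.
        self.allowance = min(self.rate, self.allowance + (now - self.last) * self.rate)
        self.last = now
        if nbytes > self.allowance:
            time.sleep((nbytes - self.allowance) / self.rate)  # wait until the budget refills
            self.allowance = 0
        else:
            self.allowance -= nbytes

throttle = Throttle(rate_bytes_per_sec=2_000_000)   # e.g., cap replication at ~2 MB/s
for _ in range(5):
    throttle.consume(1_000_000)                      # "send" a 1 MB change set
    print("sent 1 MB chunk at", time.strftime("%H:%M:%S"))
```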
Disaster-recovery planning shouldn’t have to involve a harsh set of tradeoffs. If users follow some straightforward recommendations and examine replication solutions in-depth before implementing them, then the cost and complexity of the overall solution can be held in check, allowing much more widespread adoption of replication technologies across a broader spectrum of applications and companies.
Brad O’Neill is a senior analyst and consultant at the Taneja Group research and consulting firm.
InfoStor, January 2007