I was asked by a fellow Twitter user to give my thoughts on best
practices for virtual machine (VM) backups. I didn't know anything about
this person's company or product, so these are just reflections of my
own opinion.
I would expect my ideal backup solution to take a snapshot using the vSphere API to protect the VM files (vmdk, vmx, etc.) and to build its backup process on that snapshot, which is native to vSphere. Why?
- Agents are dead to me. When I deploy a new server, I don't want to worry about having to install another piece of software.
- Agent updates with every new version. When there is
an update to the backup software, there is usually an update to the
agent on every server. I don't care if it's all centrally managed, it's a
task that I don't enjoy, especially when a reboot is needed.
- Agents have issues with locked files.
Traditionally, an agent within a system won't be able to back up any
files that are locked, so those files are skipped during the backup
process.
- Agents spike CPU usage. Agents within a VM have to
crawl the machine and touch every single file, whether it's to back up a
file, check an md5sum hash to see if a file has changed, etc. vSphere
snapshots take a lot of the grunt work out and save your host CPU
power.
- Full VM Recovery. Using the vSphere API, taking a
snapshot and copying that point in time to a different storage location
allows me to completely restore a whole virtual machine or even just its
configuration files if necessary. It also allows you to do file level
recovery and dig into the VMDK offline.
- Easy Replication on a VM basis. The vSphere API snapshot process makes replicating those disk files to an off-site location much easier.
Instead of using traditional SAN array replication, which moves
everything inside of a LUN, I can now pick and choose the VMs I want to
replicate.
- Changed Block Tracking. One of the best things
vSphere has to offer in terms of backup is Changed Block Tracking (CBT).
Backup vendors can use this native vSphere feature to back up only the
blocks that have changed since the last backup, resulting in far less
disk consumed, faster backup times, and shorter backup windows.
- Quiesce the file system. Telling VMware Tools to
quiesce the file system during a backup flushes any writes that are
still buffered in the VM's memory to disk, so you can take a
*true* backup of the server.
- VSS support. Some products have VSS support for Windows VMs, which gives you application-consistent backups rather than merely crash-consistent ones.
- Utilize de-dupe. Some products
have an in-line de-duplication method to de-dupe the blocks during the
backup, again resulting in far less disk space consumed and shorter
backup times.
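To make the Changed Block Tracking and de-dupe points a bit more concrete, here is a rough Python sketch of the idea. It is only a toy model under my own assumptions: it does not call the vSphere API, and the names `TrackedDisk`, `DedupeStore`, `backup`, and `restore` are made up for illustration. The disk remembers which blocks were written since the last backup, and the backup target keeps only one copy of each unique block.

```python
import hashlib

BLOCK_SIZE = 4  # tiny block size so the example is easy to follow


class TrackedDisk:
    """Toy virtual disk that records which blocks change between backups."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE)] * num_blocks
        # Every block counts as changed until the first (full) backup runs.
        self.changed = set(range(num_blocks))

    def write(self, index, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block long")
        self.blocks[index] = data
        self.changed.add(index)


class DedupeStore:
    """Toy backup target that keeps one copy of each unique block."""

    def __init__(self):
        self.chunks = {}  # SHA-256 digest -> block contents

    def put(self, data):
        digest = hashlib.sha256(data).hexdigest()
        self.chunks.setdefault(digest, data)
        return digest


def backup(disk, store, previous=None):
    """Copy only the changed blocks; return (manifest, blocks_copied)."""
    manifest = dict(previous or {})  # block index -> digest
    for index in sorted(disk.changed):
        manifest[index] = store.put(disk.blocks[index])
    copied = len(disk.changed)
    disk.changed.clear()  # start tracking changes for the next backup
    return manifest, copied


def restore(manifest, store):
    """Rebuild the full list of blocks from a manifest."""
    return [store.chunks[manifest[i]] for i in sorted(manifest)]


disk = TrackedDisk(num_blocks=8)
store = DedupeStore()

full, copied_full = backup(disk, store)        # first run reads every block
disk.write(2, b"AAAA")
disk.write(5, b"AAAA")                         # same content as block 2
incr, copied_incr = backup(disk, store, full)  # second run reads two blocks
```

The second backup only touches the two blocks that were written, and because both hold the same data the store grows by a single chunk. A real product tracks changes at the hypervisor level and de-dupes at a much larger scale, but the savings come from the same two tricks.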
Currently, as you can tell, I'm a fan of this approach for my virtual backup solution. Other virtual backup products also use most of the points listed above, including VMware's own, which is a component of most licensing packages.
The only problem with these solutions is that if you don't have all your
servers virtualized, then you can't use these backup methods. Plain and
simple, products built around the vSphere API can't back up your physical
servers. Therefore, you have two backup solutions in-house until you can
get all your servers virtualized. Currently, we're sitting at 85%
virtualized and hoping to take it to 100% by the end of this year.
(*fingers crossed*)
I would say my opinion is very strong when it
comes to a subject matter such as this. Virtualization has changed the
game and you have to evaluate everything you're doing in the datacenter.
Legacy backup solutions using agents aren't going to be the standard
moving forward. Backup vendors were the first to see this change and
they are already moving forward in the process with new features. The
next big run of applications to go agent-less is security. Having an
agent on every VM to do automated scanning for viruses, updates,
threats, etc. will be dead. Using the vSphere VMsafe API,
packet inspection can all be done without any interaction from the VM.
There is only a limited number of providers for this solution at the
moment, but once the technology matures, it will become the norm in every
datacenter.