VSS diagnostics

For the past eight months, I’ve been working with EMC and Microsoft to diagnose a problem. Several times a month, during the backup of our primary Windows 2008 R2 file server, all the VSS shadow copies get deleted for the volume containing all our shared departmental directories.

This has two major effects. First, it means that our clients can no longer recover files using the Previous Versions feature of Windows. Second, it casts significant doubt on the validity of the backups performed at that time, which EMC NetWorker reports as having completed successfully.
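One way to catch the loss when it happens is to count shadow copies before and after each backup run. The following is only a minimal sketch in Python, not something we run in production. It assumes that `vssadmin list shadows /for=<volume>` prints one "Shadow Copy ID:" line per shadow copy (the English-language output format), and the function names are made up for illustration.

```python
import subprocess


def count_shadow_copies(vssadmin_output: str) -> int:
    """Count shadow copies by counting 'Shadow Copy ID:' lines.

    Assumption: vssadmin prints one such line per shadow copy. The
    'Contents of shadow copy set ID:' header lines are not counted,
    because they start with different text.
    """
    return sum(
        1
        for line in vssadmin_output.splitlines()
        if line.strip().startswith("Shadow Copy ID:")
    )


def list_shadows(volume: str) -> str:
    """Run 'vssadmin list shadows' for one volume, e.g. 'D:'.

    Must be run from an elevated prompt on the file server.
    """
    result = subprocess.run(
        ["vssadmin", "list", "shadows", f"/for={volume}"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


def all_shadows_lost(before: int, after: int) -> bool:
    """True when the volume had shadow copies before the backup and none after."""
    return before > 0 and after == 0
```

Calling `count_shadow_copies(list_shadows("D:"))` just before and just after the backup window, and logging the result of `all_shadows_lost`, would at least pin down which backup runs coincide with the deletions.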

We have been unable to find a technical solution to the shadow copy loss, so we will be reconfiguring our storage and shared directories to accommodate the limitations of NetWorker. In the meantime, I want to note a few resources that have been helpful in diagnosing problems with VSS (it will be easier to find them here than in my pile o’ email):

Volume Shadow Copy Service (TechNet)

Volume Shadow Copy Service (MSDN)

Registry Keys and Values for Backup and Restore

How to enable the Volume Shadow Copy service’s debug tracing features in Microsoft Windows Server 2003 and Windows 2008

Using Tracing Tools with VSS

Backups – Bad and Good

We’ve been working with our backup vendor to address some shortcomings of their product as it relates to Windows 2008 system recovery. This was precipitated by a failure of a portion of our virtual infrastructure, which led to corruption of several hosts’ virtual disk files.

We had to rebuild one failed host from bare (virtual) metal because EMC NetWorker could not recover the system from backups. For Server 2008 systems, EMC requires that backups be made with client 7.5.1 and restored with 7.5.1, and you have to enable or install any server role that was present on the original system before performing the restore.

We’ve been working on other ways to make sure we can recover from a system failure. Greg has successfully scripted a dump of printer info to files using Server 2008’s printer management scripts. I’ve been working on scripted backup of SharePoint site collections. I got some help from Microsoft in determining the correct permissions needed for a service account to perform STSADM backup operations, which has been a thorny issue (see KB896148).
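For the SharePoint piece, the core of a scripted backup is a call to `stsadm -o backup` with a site collection URL and a target file. Here is a rough sketch in Python of how such a wrapper might look. It is not the script we actually use: the site URL, backup directory, file naming scheme, and function names are all hypothetical, and it assumes the account running it already has the permissions described in KB896148.

```python
import subprocess
from datetime import date
from pathlib import PureWindowsPath


def build_backup_command(stsadm_exe, site_url, backup_dir, day):
    """Build the argument list for an STSADM site collection backup.

    The dated file name is an illustrative convention, not a requirement.
    """
    filename = PureWindowsPath(backup_dir) / f"sitecollection-{day.isoformat()}.bak"
    return [
        stsadm_exe,
        "-o", "backup",
        "-url", site_url,
        "-filename", str(filename),
        "-overwrite",  # replace an existing file of the same name
    ]


def run_backup(stsadm_exe, site_url, backup_dir):
    """Run the backup for today's date and return the process exit code."""
    command = build_backup_command(stsadm_exe, site_url, backup_dir, date.today())
    return subprocess.run(command, check=False).returncode
```

Keeping the command construction separate from the call that runs it makes it easy to print and inspect the exact command line before pointing it at a live site collection, for example `build_backup_command("stsadm.exe", "http://sharepoint.example.com/sites/team", r"D:\Backups", date.today())`.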