Been away for some time finishing up on the Upgrade and Application Compatibility testing content and tools. We’re about 98% done with just a few more tweaks before final RTM. If you’re interested in the release candidate versions, check out SQL Server 2008 Upgrade Assistant.
Managed to find some time for an extended lunch so I figured I’d come back and share some other learnings - an easy one this time: backup compression. This is not the same as data compression so don’t confuse the two. I’ll blog on the latter later when I find some time. That’s pretty cool too.
I’ve worked with a lot of customers who use 3rd party backup software primarily for the compression capabilities followed by improved speed, especially during restore. With one of the more popular 3rd party backup solutions, I’ve restored a 1.2 TB database in under 30 minutes on a beefy HP Integrity rx8640 server with very a large EMC Dx3 SAN so there was no IO or CPU bottlenecks at any point. No official hard data on Katmai’s backup compression or performance right now but I know the folks in building 35 will be working hard to at least match the popular 3rd party solutions.
From some of the tests we’ve done both internally and with a couple of customers with tens and hundreds of GB databases, both compression ratio and performance seem to map closely to the popular 3rd party software we used. At least for default compressions on the 3rd party software. You can change the level of compression with some of these 3rd parties but SQL Server 2008 only has 1 level so I didn’t bother testing against different settings. I’d imagine the delta would not be significant percentage wise but if you have tens of Terabytes to deal with, even 5% can be a noteworthy difference.
How it’s done
Taking advantage of backup compression once you’ve upgraded to Katmai is just a matter of adding an extra option (in bold below) in your backup scripts
– My Backup Job with compression
BACKUP DATABASE MyBigDatabase
TO DISK = ‘N:\Tier1Backups\MyBigDatabase.bak’
WITH
NOFORMAT,
NOINIT,
NAME = ‘Full DB Backup - MyBigDatabase’,
SKIP,
NOREWIND,
NOUNLOAD,
COMPRESSION;
GO
or selecting the compression option if you are using Management Studio GUI. Note that you can change the global setting at the instance level with
SP_CONFIGURE ‘backup compression default’, 1;
so that all backup commands will always compress by default. This can be changed at the command level also when you execute your backup so you can backup with compression for specific databases only and not for everything in the instance. This is too easy.
Note that the default setting is ‘no compression’ so your DBA team does not get alarmed by the sudden reduction in the backup file sizes; just in case not everyone is keeping close tabs Katmai enhancements. Last thing you want is the graveyard shift operator re-running backup all night and finally paging you at 3am because “the backup file looks too small so it can’t be good…”. Don’t laugh (unless it’s someone else’s turn to be on-call).
Sample results
Simple tests with the AdventureWorks OLTP database yielded the following:
|
Path
|
Backup time
|
Restore time
|
File size
|
|
Original database
|
n/a
|
n/a
|
174,080 MB
|
|
NO COMPRESSION backup
|
23s
|
23s
|
171,093 MB
|
|
WITH COMPRESSION backup
|
11s
|
14s
|
40,097 MB
|
This was done on a VPC with 1.5GB of dedicated RAM and its own dedicated 7200rpm EIDE disk (8MB cache) on a USB 2 interface. The host machine is a notebook with an Intel Core 2 Duo 2.2GHz CPU (T7500), 4GB RAM and a 7200rpm SATA disk. The times are based on averages from 3 runs - all were with a couple of seconds. Other tests with real customer databases ranging from 5GB to 800GB in size yielded similar levels of compression and percentage elapsed time reduction. Sorry, can’t publish any of the customer data but those tests were run on x64 workstations with local SATA drives and IA64 servers with enterprise class SANs.
Not bad at all. Note the benefits from a performance perspective since you have much less to send to disk or retrieve when restoring. There was some impact on CPU utilization but it’s hard to say what the real impact would be when doing hot backups in a production environment since I don’t have other workloads running. You can expect some CPU load but the level will really depend on your database. Looks like it’ll be low to moderate. So long as you’re not already facing CPU pressure in your current system, I expect this to work fine but don’t take my word for it; test (no really, test). This is important because while few environments do full DB backups during peak hours, many companies have other system jobs or reports run during maintenance windows so make sure you don’t saturate the CPU. All that aside, the numbers are very decent especially since it’s still in beta.
Of course, with any compression technology, database or file system, your mileage can vary significantly. Character data types typically compress very well while binary types not as well (or very poorly in some cases). However, most traditional OLTP systems (e.g. accounting, sales, HR, etc…) will probably benefit from this since the bulk of the data is of character types.I expect this will be a widely used feature both to improve backup/restore performance and to save disk space (enterprise storage is NOT cheap).
One thing to remember is that smaller backup files also benefit some disaster recovery scenarios (basically those that don’t have good DR plans). Companies that do not have a SAN replication solution or some kind of Log Shipping/Database Mirroring solution in place, often transfer backup files to their warm DR site. Smaller files means faster transfer time which means better chance of recovery (you have a smaller window of exposure).
Password lock is available as with regular backups but there is no encryption capability with this release as far as I know. Now if you’re thinking, “hey, I’ve got Transparent Data Encryption, I can mix and match!!“. Well, don’t. Backup compression and TDE, at least based on the current release, do not compliment each other. You get next to nothing in space savings with backup compression on a TDE enabled database but you will burn the extra CPU cycles trying. As far as I can tell, this is true regardless of whether you use SQL Server’s backup compression or some 3rd party backup software with compression. However, some 3rd party solutions do offer backup compression with encryption but that’s a whole different deal. It’s a compressed backup that’s stored encrypted on disk with a non-TDE database.
Chalk this up as one of the “little to no effort” benefits when you move to SQL Server 2008. Just be aware that while it does great at the intended task, it doesn’t have all the bells & whistles that popular common 3rd party solutions do. For example, having different levels of compression and encrypting backups of non-TDE database are the most common I hear from our customers. Got something you want the SQL Server product group to add something for the next release? Go to SQL Server Connect
joe.