Been away for some time finishing up on the Upgrade and Application Compatibility testing content and tools. We’re about 98% done with just a few more tweaks before final RTM. If you’re interested in the release candidate versions, check out SQL Server 2008 Upgrade Assistant.

Managed to find some time for an extended lunch so I figured I’d come back and share some other learnings - an easy one this time: backup compression. This is not the same as data compression so don’t confuse the two. I’ll blog how to on the latter later when I find some time. That’s pretty cool too.

I’ve worked with a lot of customers who use 3rd party backup software primarily for the compression capabilities followed by improved speed, especially during restore. With one of the more popular 3rd party backup solutions, I’ve restored a 1.2 TB database in under 30 minutes on a beefy HP Integrity rx8640 server with very a large EMC Dx3 SAN so there was no IO or CPU bottlenecks at any point. No official hard data on Katmai’s backup compression or performance right now but I know the folks in building 35 will be working hard to at least match the popular 3rd party solutions.

From some of the tests we’ve done both internally and with a couple of customers with tens and hundreds of GB databases, both compression ratio and performance seem to map closely to the popular 3rd party software we used. At least for default compressions on the 3rd party software. You can change the level of compression with some of these 3rd parties but SQL Server 2008 only has 1 level so I didn’t bother testing against different settings. I’d imagine the delta would not be significant percentage wise but if you have tens of Terabytes to deal with, even 5% can be a noteworthy difference.

How it’s done

Taking advantage of backup compression once you’ve upgraded to Katmai is just a matter of adding an extra option (in bold below) in your backup scripts 

– My Backup Job with compression 

BACKUP DATABASE MyBigDatabase
TO  DISK = ‘N:\Tier1Backups\MyBigDatabase.bak’
WITH
NOFORMAT,
NOINIT,
NAME = ‘Full DB Backup - MyBigDatabase’,
SKIP,
NOREWIND,
NOUNLOAD,
COMPRESSION;
GO

or selecting the compression option if you are using Management Studio GUI.  Note that you can change the global setting at the instance level with

SP_CONFIGURE ‘backup compression default’, 1;

so that all backup commands will always compress by default. This can be changed at the command level also when you execute your backup so you can backup with compression for specific databases only and not for everything in the instance. This is too easy.

Note that the default setting is ‘no compression’ so your DBA team does not get alarmed by the sudden reduction in the backup file sizes; just in case not everyone is keeping close tabs Katmai enhancements. Last thing you want is the graveyard shift operator re-running backup all night and finally paging you at 3am because “the backup file looks too small so it can’t be good…”. Don’t laugh (unless it’s someone else’s turn to be on-call).

Sample results

Simple tests with the AdventureWorks OLTP database yielded the following:

Path

Backup time

Restore time

File size

Original database

n/a

n/a

174,080 MB

NO COMPRESSION backup

23s

23s

171,093 MB

WITH COMPRESSION backup

11s

14s

40,097 MB

This was done on a VPC with 1.5GB of dedicated RAM and its own dedicated 7200rpm EIDE disk (8MB cache) on a USB 2 interface. The host machine is a notebook with an Intel Core 2 Duo 2.2GHz CPU (T7500), 4GB RAM and a 7200rpm SATA disk. The times are based on averages from 3 runs - all were with a couple of seconds. Other tests with real customer databases ranging from 5GB to 800GB in size yielded similar levels of compression and percentage elapsed time reduction. Sorry, can’t publish any of the customer data but those tests were run on x64 workstations with local SATA drives and IA64 servers with enterprise class SANs.

Not bad at all. Note the benefits from a performance perspective since you have much less to send to disk or retrieve when restoring. There was some impact on CPU utilization but it’s hard to say what the real impact would be when doing hot backups in a production environment since I don’t have other workloads running. You can expect some CPU load but the level will really depend on your database. Looks like it’ll be low to moderate. So long as you’re not already facing CPU pressure in your current system, I expect this to work fine but don’t take my word for it; test (no really, test). This is important because while few environments do full DB backups during peak hours, many companies have other system jobs or reports run during maintenance windows so make sure you don’t saturate the CPU. All that aside, the numbers are very decent especially since it’s still in beta.

Of course, with any compression technology, database or file system, your mileage can vary significantly. Character data types typically compress very well while binary types not as well (or very poorly in some cases). However, most traditional OLTP systems (e.g. accounting, sales, HR, etc…) will probably benefit from this since the bulk of the data is of character types.I expect this will be a widely used feature both to improve backup/restore performance and to save disk space (enterprise storage is NOT cheap).

One thing to remember is that smaller backup files also benefit some disaster recovery scenarios (basically those that don’t have good DR plans). Companies that do not have a SAN replication solution or some kind of Log Shipping/Database Mirroring solution in place, often transfer backup files to their warm DR site. Smaller files means faster transfer time which means better chance of recovery (you have a smaller window of exposure).

Password lock is available as with regular backups but there is no encryption capability with this release as far as I know. Now if you’re thinking, “hey, I’ve got Transparent Data Encryption, I can mix and match!!“. Well, don’t. Backup compression and TDE, at least based on the current release, do not compliment each other. You get next to nothing in space savings with backup compression on a TDE enabled database but you will burn the extra CPU cycles trying. As far as I can tell, this is true regardless of whether you use SQL Server’s backup compression or some 3rd party backup software with compression. However, some 3rd party solutions do offer backup compression with encryption but that’s a whole different deal. It’s a compressed backup that’s stored encrypted on disk with a non-TDE database.

Chalk this up as one of the “little to no effort” benefits when you move to SQL Server 2008. Just be aware that while it does great at the intended task, it doesn’t have all the bells & whistles that popular common 3rd party solutions do. For example, having different levels of compression and encrypting backups of non-TDE database are the most common I hear from our customers. Got something you want the SQL Server product group to add something for the next release? Go to SQL Server Connect

joe.

Continuing from my previous post on [URL=http://blog.scalabilityexperts.com/?p=79]DMF[/URL], let’s look at several things you probably do today either manually or by writing some creative scripts or little apps to accomplish. We’ll start with the very basic stuff and then look at something a little more “interesting”.

Before we get started, there are a few DMF specific terms you need to be familiar with that are new in SQL Server 2008. Policies we all get (perhaps too well). Then there are targets. As the word implies, that’s what you want to apply your policies to. Targets can be a SQL Server instance, database, table, etc… Within the targets, you have different facets that you can apply one or more policies to. For instance a database target can have options (sp_db_options is deprecated btw), table, etc… You can set a naming policy for tables or set auto update stats on policy for the database. You might have noticed by now there’s a tree hierachy to the targets but we’re database guys, so we get this (or think we do). Anyway, other things to be aware of are conditions and execution modes. Conditions are just that; conditions you define for a facet. Execution modes determine how your policies are executed and enforced (e.g. passively as in log the transgression or actively where non-compliant activities are disallowed). This is just the elevator brief on the terms. Do look them up in BOL for detailed definitions and also for a few other related subjects not covered here.

Background:
Let’s say you’re new to the company/department (previous DBA just stopped showing up for work one day and won’t answer calls/emails, nobody knows why) and you’re the only “real” DBA in the company/department. You quickly discover that everyone and their pet iguanas know the SA password and you cannot change it or take it back because the developers have test harnesses with the password hardcoded and most if not all production apps use SA for one or more tasks. Well your boss wants you to clean up shop.

So your first few months are spent chasing after one performance issue after another along with the usuals like backups, reindexing, space allocations, etc… Somethings you script out and use SQL Agent to run periodically, others you write DDL triggers for. With servers running 2005 SP2, you even find time (between 2-5am) to write custom reports with colorful charts. Finally, after months of hard work and some help from a guy named Johnnie W. (the one with the blue ribbon, not the green one), you convinced your CIO to upgrade to SQL Server 2008. Now you’ve got some serious tools to work with…. :)

Scenario 1: Database options policy
Starting small, you decide to set standard database options. You figured by setting auto stats management back on will solve 60% of performance tickets because someone had read somewhere that stats management can burn server resources and decided to turn it off in the interst of performance. Of course, that someone also doesn’t know what the stats are for and calls you regularly to complain about slow queries. And then you have 95% of the log files autogrowing at 10% because the default values have to be the best, right? Well, you’d rather create a big enough file and let it auto grow by 25MB increments. This should be enough for a start.

You fire up Management Studio and browse to Management in the object explorer. You then expand Facet and create the appropriate new conditions for the Database Option and Log File facets.  

DMF_Policy

Next, you create your first policy by right-clicking on Policies and select the conditions you want to apply. In this case, it’s the Database Options policy (use whatever name you assigned earlier). You’re even kind enough to provide an informational message in the description and a link to an intranet site for more information. You schedule this to run weekly because you know your team tends to get careless on Monday mornings before their second cup of coffee. When done, you repeat this for the Log Files condition.

DMF_Facets

All this is done easily via SSMS and the screenshots show a couple of the steps. Books Online documents this very clearly and even provides some examples. Do check it out when you get a chance.

Now, what is not documented are the actual commands executed to create the conditions and policies. The current cTP build doesn’t even have a “Script” button for you to script everything out like other tasks in SSMS. BOL doesn’t mention anything about what commands are used either. A simple trace reveals the use of system SPs in msdb such as msdb.dbo.sp_syspolicy_add_condition to create the condition. It takes in several parameters including name, description and an expression which is basically the condition you define but in XML-like format. Exporting the policies also yields a similar form. Not sure what the release plans are but according to Buck Woody (SQL Server PM who delivered a DMF session at SQLPASS), the product group does intend to provide scripting support so check out future CTP releases as they become available.

Regardless, this will take you a whole 2-3 minutes to create and maybe a few more to push out to your existing targets which can include new databases that will be created later. In less than 15 minutes after upgrading to Katmai, you’re done with chasing after databases every week to ensure they have the right database options. Not bad at all.

Though useful this is too easy a scenario. Next, we’ll look at a more complex example where you can have multiple conditions (including differentiating between 32-bit and 64-bit platforms) and policies that depend on other policies. 

 joe yong

Most new product releases come with a slew of new features along with some new ways of doing the same thing. Well, Katmai introduces a “somewhat” new way to manage your SQL Server systems. I say somewhat because the concept isn’t really new but it is now being supported by enhancements in the management tools. Also, in addition to changes in the tools, Microsoft has also come up with guidance on how you should manage your SQL Servers. Welcome to the Declarative Management Framework (DMF). Btw, there are at least a couple of places in the July CTP BOL that incorrectly calls it the “Dynamic” Management Framework; it really should be Declarative. At least that’s what it was called at the last major public event - PASS Summit 2007 (Denver).

So what’s DMF? In a nutshell, it’s an approach to database administration based on policies and not tasks. Many DBAs plan their work around the various tasks they need to perform which may include backing up databases, reviewing event logs, scanning for improper/unauthorized object creations, killing long running query SPIDs, etc… The list goes on for quite a bit (all the way to Christchurch my Kiwi friends say but my OZ buddies assure me there really isn’t much East of Wollongong). Bottom line is, there are a lot of tactical and sometimes reactionary items on the list which can keep the conscientious DBA busy pretty much all year around (not even counting the support calls).

With DMF, you’re suppose to determine what policies, restrictions, behaviours, etc…. the SQL Servers in your organization will enforce, define the policy using Management Studio and select one or more servers to enforce the policy. You then monitor from a central console; SSMS. The classic example (if you paid attention at TechEd or PASS Summit this year), is users are not allowed to create tables in the DBO schema. That’s handy but not particularly interesting. How about making sure all user created stored procedures begin with a USP_ prefix? Disallow the use of SQLMail? No OPENROWSET queries while we’re at it. Notice that you can define policies that have a database or a server scope? Pretty cool huh? Sure, you can change the relevant sp_configure settings for each server and db_options. You can even put all that in a script and run them against all the databases/servers in your org. Well, DMF and SSMS in Katmai makes all that a lot easier. It’s not reinventing the wheel, it’s just putting on some good tires.

Though not stated by Microsoft, it really looks to me that they are trying to drive the whole “manage by exception” philosophy, which I subscribe to. DBAs should not be repeating mundane tasks day after day. These can and should be automated. However, they should not operate in a blackbox either. The DBA should have full view and control over what’s going on and if an “exception” occurs, the DBA should be notified. Whether automatic corrective actions are taken or not, that depends on the nature of the event.

So what does that have to do with ugprading to Katmai? Well, it’s a classic example of a product upgrade that requires a DBA upgrade (Side-by-side is best). Sure, you can continue to manage things the old way but if you are working in a dynamic environment that continues to grow, you’ll soon find yourself overwhelmed (unless your management is kind enough to add new headcount your team every year, no?).

This is a pretty broad topic and an important one too given the number of companies that have more than a dozen SQL Server instances. So instead of writing another 5 pages on DMF and how it works with Management Studio, I’ll stop here with the intro.

Next up will be several scenarios with DMF and details on how they are implement.

Btw, if you are currently facing challenges managing multiple SQL Servers in your company, drop me a note and we’ll see if/how DMF and enhancements in Katmai SSMS helps.

joe yong.