If I had to pick my favorite feature in Katmai, it would have to be the Resource Governor. This is something I had wanted for a long time for various reasons and was sorely disappointed when it was cut from Yukon. I’m sure Richard & Euan remember the amount of grief I gave them though it wasn’t really them who made the final cut decision (but they did own the feature back then).
Prior to SQL Server 2008 Resource Governor, there really was very little you could do to manage resources within an instance. The Windows System Resource Manager controls Windows processes so all you’re doing is restricting what the sqlservr.exe process can consume. Max/min memory settings locked things down at the instance level and so did CPU and I/O affinity. There really wasn’t much that you could do to specific users or groups. Sure, there was the query governor but all you can really do is set an “execution time threshold” for queries, i.e. if the estimated cost of a query exceeds the threshold, it won’t be allowed to run. Not bad for preventing very heavy queries from running but it doesn’t really control resources either, and the side effect is that users might start opening support tickets thinking something’s wrong with the system because they can’t run their reports. Also, if your statistics are out of date and not automatically updated, the cost estimates will be off and queries may be incorrectly allowed or blocked.
With Resource Governor, Admins can now set controls on how resources are consumed in SQL Server. At this time, the RG can control CPU and memory. Disk I/O and network I/O are not covered in this release. Still, this is already a giant step forward.
Let’s try a few things. First, the old way using the Query Governor cost limit. You can set this at the server instance (sp_configure) or on a per connection basis (set query_governor_cost_limit). If your goal is really to prevent long running queries during peak hours, you probably set this at the instance level. After all, how many users do you know will voluntarily not run their heavy queries?
If you’re wondering what self-respecting DBA would allow reporting, DSS and other BI type queries run against a production OLTP system during business hours, well, DBAs don’t write the checks or run the business. It is reality today that the business users want/need to analyze data rapidly, not after a day, week or month. While you can create separate reporting servers, there are some genuine needs where a reporting server’s latency is beyond what the business users require. It’s the new world my fellow traditional DBAs - embrace and control it!
Now, let’s try this example. The following script (written by my colleague Suresh) scans the order tables in the Sales schema in the AdventureWorks database and applies a discount to order lines with a purchase quantity of 3 or more. A cursor is used to scroll through and evaluate each order, taking action where appropriate. This query takes anywhere from 20-25 minutes to run on a workstation, consuming a notable amount of CPU and memory.
USE AdventureWorks;
GO
DECLARE @l_SalesOrderID INT
SET NOCOUNT ON
-- Walk every order header, one row at a time
DECLARE update_cursor SCROLL CURSOR
FOR
SELECT SalesOrderID FROM Sales.SalesOrderHeader
OPEN update_cursor
FETCH NEXT FROM update_cursor INTO @l_SalesOrderID
WHILE (@@fetch_status = 0)
BEGIN
    -- Apply the discount to this order's lines with a quantity of 3 or more
    UPDATE Sales.SalesOrderDetail
    SET UnitPriceDiscount = 0.01
    WHERE SalesOrderID = @l_SalesOrderID
    AND OrderQty >= 3
    FETCH NEXT FROM update_cursor INTO @l_SalesOrderID
END
CLOSE update_cursor
DEALLOCATE update_cursor
SET NOCOUNT OFF
GO
SQL Server 2005 (and earlier)
With SQL Server 2005 and earlier, you would do one of the following:
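Instance wide: sp_configure 'query governor cost limit', 300
Per connection: SET QUERY_GOVERNOR_COST_LIMIT 300
Either of these will evaluate queries and disallow execution if their estimated cost exceeds 300. The cost is the optimizer’s estimate of elapsed seconds on a reference hardware configuration, not the actual running time on your server. If you try to execute the sample query from above, you will get a message that looks like: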
Error 8649, Severity 17, State 1, Procedure iduSalesOrderDetail, Line 33
The query has been canceled because the estimated cost of this query (1157) exceeds the configured threshold of 300. Contact the system administrator.
Msg 3609, Level 16, State 1, Line 14
The transaction ended in the trigger. The batch has been aborted.
So that accomplishes your goal of not letting heavy queries execute. These settings can be changed at any time so you can set up SQL Agent to apply the appropriate value at different times of the day. However, the big downside here is that your “heavy” queries do not get executed at all. This probably won’t go down very well with your end users. How about letting the query run but not allowing it to dominate server resources?
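As a rough sketch of the SQL Agent idea, a job step scheduled for the start of peak hours could look like the following. The value of 300 is just the number used in this article, and a second step scheduled for off-peak hours would set it back to 0 (which means no limit). Since this is an advanced option, it has to be made visible first.

```sql
-- Sketch of a peak-hours job step; the schedule and the value 300 are assumptions.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'query governor cost limit', 300;
RECONFIGURE;
```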
With SQL Server 2008, you have MUCH better control and can keep both regular OLTP and batch/reporting users happy.
There are a few terms we need to be familiar with before diving into Resource Governor. The following are brief descriptions of key terms. BOL has the gory details which I encourage you to check out.
- Workload group: a logical way to classify different types of workloads you want to govern
- Internal group: a “special” group created by and for SQL Server for system processes. This is not user configurable.
- Default group: another special group created by SQL Server to be the catch-all bucket for any workload that doesn’t get allocated into a user defined group for whatever reason.
- Classification function: a UDF that determines which group a submitted workload belongs to
- Resource pools: a logical container for the amount of CPU and memory that members of the pool can consume
- Internal pool: a “special” resource pool created by and for SQL Server to be consumed by system processes (in the internal group). This is not user configurable and will always have higher priority than all other pools.
- Default pool: another special resource pool created by SQL Server to be the catch-all bucket for any group (starts with the default group) that doesn’t get allocated into other user defined pools. While it cannot be dropped, its configuration can be altered.
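If you want to see how these terms map to an actual instance, the catalog views below list each workload group next to the pool it belongs to. This is only a quick look-around query; on a fresh instance I would expect to see just the internal and default entries.

```sql
-- Configured workload groups and the resource pool each one uses
SELECT g.name AS workload_group,
       p.name AS resource_pool,
       g.importance,
       p.max_cpu_percent,
       p.max_memory_percent
FROM sys.resource_governor_workload_groups AS g
JOIN sys.resource_governor_resource_pools AS p
    ON p.pool_id = g.pool_id;
```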
Getting started with Resource Governor.
Before we can start bossing workloads around with RG, there are a few things we need to consider.
Rules and environment prep:
- How many logical workload groups do you have in your database? E.g. regular OLTP users, report users, batch processors, administrators, security team, etc. Can you standardize across your organization?
- What is the typical CPU and RAM utilization of your server? While limits can be adjusted easily, knowing the typical utilization will help avoid setting initial limits that are too low and causing performance problems.
- Is there an order of priority among your users?
- Is there an SLA (implicit or written) on query response time?
Once you figure out what you need from a user and administration perspective, you can then start implementing the controls.
Basic setup:
- Create workload groups in your environment that you want to have specific resource allocations for (others will fall into the default group)
- Create a classification function to place workloads in appropriate groups
- Register the classification function with Resource Governor
- Enable Resource Governor
The syntax itself is easy enough so I won’t elaborate here. The example below uses explicit transactions but it does not have to be that way. However, this is a recommended practice; in case you make a mistake, it is easily undone (rolled back).
First, let’s create the buckets (known as groups) that we want to place our users into:
BEGIN TRAN
CREATE WORKLOAD GROUP ResGroupAdhoc
CREATE WORKLOAD GROUP ResGroupReport
CREATE WORKLOAD GROUP ResGroupAdmin
CREATE WORKLOAD GROUP ResGroupExecs
Next we need to create a classification function. This is the “filter” that decides which group each new connection to SQL Server (and therefore every query it submits) belongs to. The function has to be created in the master database. Note that any workload that does not get classified by your function goes into the ‘DEFAULT’ group.
GO
-- CREATE FUNCTION must be the first statement in its batch, hence the GO above
CREATE FUNCTION dbo.ClassifierFn_Basic() RETURNS SYSNAME WITH SCHEMABINDING AS
BEGIN
    DECLARE @ResGroup AS SYSNAME
    -- Later checks override earlier ones; if nothing matches, NULL is returned
    -- and the session lands in the default group
    IF (SUSER_NAME() = 'sa') OR (SUSER_NAME() = 'REDMOND\v-jyong')
        SET @ResGroup = 'ResGroupAdmin'
    IF (SUSER_NAME() = 'BillG') OR (SUSER_NAME() = 'CORP\DaBoss')
        SET @ResGroup = 'ResGroupExecs'
    IF (APP_NAME() LIKE '%Management Studio%') OR (APP_NAME() LIKE '%SQL Query Analyzer%') OR
        (APP_NAME() LIKE 'SQLCMD')
        SET @ResGroup = 'ResGroupAdhoc'
    IF (APP_NAME() LIKE '%REPORT SERVER%') OR (APP_NAME() LIKE '%Microsoft Office%')
        SET @ResGroup = 'ResGroupReport'
    RETURN @ResGroup
END
GO
COMMIT TRAN
GO
If you aren’t sure what App_Name to filter with, run Profiler for a while just to capture the ApplicationName column.
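If you don’t feel like running a trace, a quick alternative is to ask the server which application names are connected right now. This only shows current sessions, so it is a snapshot rather than a full capture.

```sql
-- Application names of currently connected user sessions
SELECT program_name, COUNT(*) AS session_count
FROM sys.dm_exec_sessions
WHERE is_user_process = 1
GROUP BY program_name
ORDER BY session_count DESC;
```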
You can set classification conditions by a number of different connection properties including user name (both SQL Server and domain users), application name, hostname (source machine name), server role, etc… Check out BOL for details.
This function is just a sample of what you can do to place each user connection or application into specific groups. This is how you classify/categorize your users and it is with these groups that you will set resource utilization controls on later.
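As a variation, here is a sketch of a classifier that looks at server role membership and the source machine name instead of logins and application names. The function name and the 'RPT-%' host naming convention are made up for illustration, and it would be created in master like the one above. Keep in mind that only one classifier function can be registered at a time, so in practice you would fold checks like these into a single function.

```sql
-- Hypothetical classifier: the function name and host prefix are assumptions
CREATE FUNCTION dbo.ClassifierFn_ByHostAndRole() RETURNS SYSNAME WITH SCHEMABINDING AS
BEGIN
    DECLARE @ResGroup AS SYSNAME
    -- Members of the sysadmin server role go to the admin group
    IF IS_SRVROLEMEMBER('sysadmin') = 1
        SET @ResGroup = 'ResGroupAdmin'
    -- Connections from machines named like RPT-01, RPT-02, ... go to reporting
    ELSE IF HOST_NAME() LIKE 'RPT-%'
        SET @ResGroup = 'ResGroupReport'
    -- Anything else returns NULL and falls into the default group
    RETURN @ResGroup
END
GO
```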
Next, we need to register the function with Resource Governor and enable RG.
ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.ClassifierFn_Basic)
GO
ALTER RESOURCE GOVERNOR RECONFIGURE
GO
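To double check what Resource Governor has on record, you can look at its configuration catalog view. This is just a sanity check I find handy; it is run from master because that is where the classifier function lives.

```sql
USE master;
GO
-- Which classifier function is registered, and is Resource Governor enabled?
SELECT OBJECT_SCHEMA_NAME(classifier_function_id) AS classifier_schema,
       OBJECT_NAME(classifier_function_id) AS classifier_name,
       is_enabled
FROM sys.resource_governor_configuration;
```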
Now that we have the function registered and RG enabled, we can set resource usage allocations for each group that we created.
First thing we want to do is to limit the amount of CPU resources ad hoc queries can consume. Let’s say we do not want ad hoc queries to consume more than 20% of CPU cycles and no more than 15% of available memory if there are other workloads running. To do this, we need to create a resource pool, place the ad hoc group in that resource pool and set limits on that pool.
CREATE RESOURCE POOL ResPoolAdhoc WITH (MAX_CPU_PERCENT = 20,
MAX_MEMORY_PERCENT = 15)
Next, we will assign the Adhoc group to the resource pool we just created so it will be governed by the policies we set for that pool. We then need to notify Resource Governor (kinda like what we already do when making a change with sp_configure).
ALTER WORKLOAD GROUP ResGroupAdhoc USING ResPoolAdhoc
GO
ALTER RESOURCE GOVERNOR RECONFIGURE
GO
Now all queries that are submitted via SQL Server Management Studio, Query Analyzer or SQLCMD (from a command prompt) will not be allowed to consume more than 20% CPU time or 15% of available memory if there are other workloads running. If the server isn’t very busy, RG is smart enough to let the queries use free cycles. For more details on how it works, check out BOL.
You can now repeat this process for other groups to set the max and/or min values for each group by creating and assigning resource pools for the groups. Try it out by submitting the long running query at the start of this article but use different UserIDs that belong to different resource groups. Run several instances of them and you’ll see the effect of the Resource Governor. It becomes very obvious when you run a lot of instances of that query.
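For instance, repeating the process for the reporting group might look like this. The pool name and the percentages are placeholders I picked for illustration, so tune them to your own workload.

```sql
-- Hypothetical pool for reporting; name and percentages are assumptions
CREATE RESOURCE POOL ResPoolReport WITH (MIN_CPU_PERCENT = 10,
    MAX_CPU_PERCENT = 40,
    MAX_MEMORY_PERCENT = 30)
GO
ALTER WORKLOAD GROUP ResGroupReport USING ResPoolReport
GO
ALTER RESOURCE GOVERNOR RECONFIGURE
GO
```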
Note that if you forget to assign a particular group to a resource pool, it will remain in the default pool. New applications or applications that are not defined in your function will fall in the default group which also falls into the default pool. (For example: another team may decide to use Toad as their query tool instead of Query Analyzer). That’s why it is always a good idea to set the default pool limits to values that you are comfortable with so that queries that are not caught by your classifier function will not hog server resources.
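Capping the default pool is a one-liner plus a reconfigure. The 50% figures below are arbitrary examples; note that the pool name needs brackets (or double quotes) because DEFAULT is a reserved word.

```sql
-- Example limits only; pick values that suit your server
ALTER RESOURCE POOL [default] WITH (MAX_CPU_PERCENT = 50,
    MAX_MEMORY_PERCENT = 50)
GO
ALTER RESOURCE GOVERNOR RECONFIGURE
GO
```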
A few other things you should probably be aware of:
- Dedicated Admin Connection runs in the “internal group”. That means you do not need to place it in a group to ensure it will always be able to run. SQL Server already takes care of that for you.
- Don’t create vastly complicated classifier functions. All connections will be filtered through the function so if you have some convoluted code that performs 27000 checks, the connection may time out before getting classified.
- Multiple groups can belong to the same pool (e.g. default pool) but you can still impose some priority for allocation within that pool using the “IMPORTANCE” setting (LOW, MEDIUM, HIGH).
- You can do all this in Management Studio, it doesn’t have to be via TSQL
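To illustrate the IMPORTANCE setting, here is a small sketch that assumes the Execs and Admin groups from earlier are both still sitting in the default pool. Importance only matters relative to other groups in the same pool.

```sql
-- Assumes both groups share one pool (the default pool in this article)
ALTER WORKLOAD GROUP ResGroupExecs WITH (IMPORTANCE = HIGH)
GO
ALTER WORKLOAD GROUP ResGroupAdmin WITH (IMPORTANCE = MEDIUM)
GO
ALTER RESOURCE GOVERNOR RECONFIGURE
GO
```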
Ok, so this is quite a bit more involved than the Query Governor Cost Limit in SQL Server 2005 and earlier but it also provides a lot more control and flexibility. This is one of those features that requires some work to take advantage of but it really isn’t much work and the returns are huge. We covered the basics here, which should be enough to get you started. There are quite a few more things you can do with RG including setting up filters that just log events when queries run for a specific period (want an easy way to track offenders with long running queries?). BOL documents this pretty well; also look for whitepapers (to come) on www.microsoft.com/sql/
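One way to get that kind of tracking, as far as I can tell, is the REQUEST_MAX_CPU_TIME_SEC setting on a workload group. In SQL Server 2008 a request that exceeds it is not cancelled; instead a CPU Threshold Exceeded event is raised that you can pick up in a trace. The 300 second value here is only an example.

```sql
-- Flag ad hoc requests that use more than 300 seconds of CPU (example value)
ALTER WORKLOAD GROUP ResGroupAdhoc WITH (REQUEST_MAX_CPU_TIME_SEC = 300)
GO
ALTER RESOURCE GOVERNOR RECONFIGURE
GO
```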
So what should you do after upgrading to 2008 if you currently use Query Governor Cost Limit? I suggest creating the appropriate rules in Resource Governor (sorry, no such thing as an upgrade/migrate for this) starting with a close to 1-1 match for what you had before. Get used to its behavior then expand with more rules. This will work for probably >95% of the users of the Query Governor Cost Limit. There are some scenarios where you truly do not want a query to run if it is expected to last longer than x minutes. In which case, you can keep both; they are compatible.
Btw, if you’re wondering about how to monitor activity and how the allocations are actually being used, monitoring data is exposed via System Monitor (aka perfmon) and DMVs. BOL documents these well or you could just do what most DBAs do; fire it up and look around.
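As a starting point for looking around, the two queries below show running totals per workload group and which group each current session was classified into. Treat them as a sketch; the DMVs expose many more columns than I am selecting here.

```sql
-- Running totals per workload group, with memory used by its pool
SELECT g.name AS workload_group,
       p.name AS resource_pool,
       g.total_request_count,
       g.active_request_count,
       g.total_cpu_usage_ms,
       p.used_memory_kb
FROM sys.dm_resource_governor_workload_groups AS g
JOIN sys.dm_resource_governor_resource_pools AS p
    ON p.pool_id = g.pool_id;

-- Which group did each current user session land in?
SELECT s.session_id, s.login_name, s.program_name, g.name AS workload_group
FROM sys.dm_exec_sessions AS s
JOIN sys.dm_resource_governor_workload_groups AS g
    ON g.group_id = s.group_id
WHERE s.is_user_process = 1;
```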
Beyond workload governance
Performance management and runaway queries aren’t the only things the Resource Governor is good for. With the ability to set resource utilization limits within an instance, it becomes harder for malicious acts against your database, such as a DDoS attack, to cause widespread problems. It also provides greater flexibility and resource usage efficiency in database consolidation. There’s a good reason why this one feature is cited across multiple scenarios in the 2008 release. It really is super handy. Now as soon as we get network and disk I/O added to the resources we can govern, DBAs will rule the world (more than before).
joe yong

