PRPASS Home
  PASS Summit Unite 2009
The Puerto Rico's Professional Association of SQL Server 
Skip Navigation LinksHome : SQL World
  
Skip Navigation Links.     
Membership Info

Latest: morale00
New Today: 0
New Yesterday: 0
Overall: 418
People Online

Visitors: 3
Forum Last Activity

alan_koo posted at 1/7/2010 4:40:00 PM
 
SQL World

This section brings to you the best live content of the SQL around the World. Select the category and source from the topics that you are looking for and enjoy it.

 Category:     Source:
From MSQL Development Customer Advisory Team
Identifying query compile/runtime parameter values using XML SHOWPLAN output.
Published: Wed, 09 Jun 2010 18:50:51 GMT


I recently returned from Convergence 2010; the annual North American Microsoft Dynamics User Conference in Atlanta. It was a really great conference with many Customer, Partner, and Microsoft presentations and success stories. It was inspiring to see what Microsoft Dynamics products are being used for in the real world. I presented a number of sessions focused on Dynamics/SQL performance, and it became very apparent to me that XML SHOWPLAN, a highly useful feature of SQL Server, is not well understood and not being used to identify a common issue related to parameter sniffing. You can find more info regarding parameter sniffing here: (http://blogs.msdn.com/queryoptteam/archive/2006/03/31/565991.aspx)

Basically, the issue revolves around the fact that SQL Server caches the query plan that was first sent to SQL Server and compiled. This helps performance by not recompiling the same query each time it is issued. However, this also means that the execution plans get created taking into account the very first parameter values that were passed to the query. In most cases this works fine, however there are other cases where this has known to cause problems, as the example below will show.

Newer versions of Microsoft Dynamics NAV have solved the parameter sniffing issue by using the new-in SQL Server 2008 ‘OPTIMIZE for UNKNOWN’ hint, which I previous blogged about here: (http://blogs.msdn.com/mssqlisv/archive/2008/11/26/optimize-for-unknown-a-little-known-sql-server-2008-feature.aspx), however I believe that understanding what to look for in the XML SHOWPLAN output helps to better understand and diagnose problems related to cached parameterized query reuse.

NOTE: the information presented below is not specific to Microsoft Dynamics products and is applicable to any SQL DBA that deals with parameterized queries.

The SQL Server optimizer will create query plan specific output in XML that is quite detailed and it contains many more details about the query plan that are not exposed in the simpler SHOWPLAN_ALL or SHOWPLAN_TEXT output, in particular:

<ParameterList>

<ColumnReference Column="@P5" ParameterCompiledValue="'1753-01-01 00:00:00.000'" ParameterRuntimeValue="'2010-04-01 11:01:05.000'" />

As you might imagine, the data above relates to the parameter [Column="@P5"] used to compile a given query execution plan, the value that was used to originally compile the plan [ParameterCompiledValue="'1753… ] and the parameter value that was used during this invocation of the query [ParameterRuntimeValue="'2010… ].

XML SHOWPLAN is fully documented here: (http://msdn.microsoft.com/en-us/library/ms187757.aspx)

Let’s have look at a few examples to see how to use this functionality. I am using the same basic data presented in the ‘OPTIMIZE for UNKNOWN’ blog listed above:

(http://blogs.msdn.com/mssqlisv/archive/2008/11/26/optimize-for-unknown-a-little-known-sql-server-2008-feature.aspx).

The scenario is: I am a DBA interested in understanding why on some days a query runs great, but on other days it runs much slower. In many cases, this is caused by query execution plans being compiled and cached with non-optimal initial parameter values. The outcome of this phenomenon is that SQL Server will compile and cache a ‘good’ plan for the first parameter values. Unfortunately, this can have the unintended side-effect of caching a poor execution plan for all subsequent parameter values. To make this clearer let’s look at the following example;

CREATE PROCEDURE AllFromT

@p1 int,

@p2 int

AS

SELECT * FROM t WHERE col1 > @p1 or col2 > @p2 ORDER BY col1

GO

Let’s assume, for simplicities sake, that the table consists of three columns (col1, col2, col3), col1 is unique, col2 has 1000 distinct values, col3 contains a single default value, there are 10,000,000 rows in the table, the clustered index consists of col1, and a nonclustered index exists on col2.

Imagine the query execution plan created for the following initially passed parameters: @P1= 1 @P2=99

These values would result in an optimal queryplan for the statement contained in the stored procedure using the substituted parameters:

SELECT * FROM t WHERE col1 > 1 or col2 > 99 ORDER BY col1

Now, imagine the query execution plan if the initial parameter values were: @P1 = 9890000 and @P2 = 550.

Likewise, an optimal queryplan would be created after substituting the passed parameters:

SELECT * FROM t WHERE col1 > 9890000 or col2 > 550 ORDER BY col1

These two identical parameterized SQL Statements would potentially create and cache very different execution plans due to the difference of the initially passed parameter values. Both query plans are valid (and in fact; good) for the parameter values that were passed, but may not be optimal for subsequent invocations of the query. This can be a particular problem if the initially passed parameters are such that the query returns no results.

The ways to work-around the issue are described my previously listed blog; the examples below help you detect whether this is the issue by identifying the initial and subsequently passed parameter values.

The test below does the following:

· Creates the table, indexes, and data

· Creates a stored procedure that contains a parameterized query

· Runs the stored procedure using different parameter values

· Has you create an ‘Execution Plan.xml’ file – pay attention to the differences in:

<ParameterList>

<ColumnReference Column="@p2" ParameterCompiledValue="(589)" ParameterRuntimeValue="(1000)" />

<ColumnReference Column="@p1" ParameterCompiledValue="(500000)" ParameterRuntimeValue="(1670000)" />

</ParameterList>

· Clears the Procedure cache

· Runs the stored procedure using the same parameter values as above but in different order

· Has you create an ‘Execution Plan.xml’ file

This example clearly shows how different parameter values can cause vastly different plans to be cached and by looking at and understanding the XML output you can make better decisions about whether parameter sniffing is a source of query performance issues in your databases.

NOTE: the XML output can also be created using various SQL Server Trace events, as well as looking in the SQL Server DMV’s.

TEST:

It’s best to run each step below individually (that is; between the comments). This repro has 1,000,000 rows in the table not 10,000,000 as described above, however the same end-results are demonstrated.

--create the table and the data

--!!!!Please be aware that the data creation

--!!!!may take more than a few minutes

--On my laptop it took approximately 20 minutes to create the data

CREATE TABLE t (col1 int, col2 int, col3 int)

GO

DECLARE @i INT

DECLARE @i2 INT

DECLARE @id INT

DECLARE @id2 INT

SET @i = 1

SET @i2 = 0

SET @id = 0

SET @id2 = 0

WHILE @i < 1001

BEGIN

SET @id2 = @id2 + 1

WHILE @i2 < 1000

BEGIN

SET @id = @id + 1

SET @i2 = @i2 + 1

INSERT INTO t VALUES (@id,@id2,73)

END

SET @i = @i + 1

SET @i2 = 0

END

GO

--create the indexes

CREATE CLUSTERED INDEX clind ON t(col1)

GO

CREATE INDEX ind1 ON t(col2)

GO

--create the stored procedure

CREATE PROCEDURE AllFromT

@p1 int,

@p2 int

AS

SELECT * FROM t WHERE col1 > @p1 or col2 > @p2 ORDER BY col1

GO

--!!!! Make sure you turn on ‘Include actual execution plan’

--!!!! before running the stored procedure

--run the stored procedure with non-typical values

EXEC AllFromT 1670000,1000

--Now look at the actual execution plan it should look like:

clip_image002

--This is a great plan for these parameter values

--Now run the stored procedure with more-typical values

EXEC AllFromT 500000,589

--Now look at the actual execution plan,

--it will be the same as above because it is using the cached plan

--However this time:

--Right click on the execution plan

--Now click on 'Show Execution Plan XML...'

--A new tab will open called 'Execution Plan.xml'

--In here look for the following:

<ParameterList>

<ColumnReference Column="@p2" ParameterCompiledValue="(1000)" ParameterRuntimeValue="(589)" />

<ColumnReference Column="@p1" ParameterCompiledValue="(1670000)" ParameterRuntimeValue="(500000)" />

</ParameterList>

--Another thing to look at, which is exposed in the actual showplan details

--is the detail regarding how many rows the optimizer thought it would

--retrieve vs. the actual number it did retrieve.

--in the case above because the plan was created for non typical values

--the estimated number of rows and actual numeber are way off, that is

--often an indication that this query plan is not optimal and could be

--improved by a recompile, or plan hint

--look for the following values:

EstimateRows and the matching ActualRows

--looking at the XML output for the second query above you will see

-- EstimateRows = "2" or less, but the matching ActualRows are much higher

<RelOp AvgRowSize="19" EstimateCPU="2E-07" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimateRows="2" LogicalOp="Concatenation" NodeId="3" Parallel="false" PhysicalOp="Concatenation" EstimatedTotalSubtreeCost="0.0065664">

<OutputList>

<ColumnReference Column="Uniq1002" />

<ColumnReference Database="[master]" Schema="[dbo]" Table="[t]" Column="col1" />

<ColumnReference Database="[master]" Schema="[dbo]" Table="[t]" Column="col2" />

</OutputList>

<RunTimeInformation>

<RunTimeCountersPerThread Thread="0" ActualRows="911000" ActualEndOfScans="1" ActualExecutions="1" />

</RunTimeInformation>

--Now let’s reverse the order, but first lets clear the procedure cache

DBCC FREEPROCCACHE

GO

EXEC AllFromT 500000,589

--Now look at the actual execution plan, it should be very different

clip_image004

--Note that the optimizer determined that it is ‘cheaper’ to do a table scan

--based on the passed parameter values. Based on the number of ActualRows read

--in the second example above, this seems like a good choice

--Now run the stored procedure with non-typical values

EXEC AllFromT 1670000,1000

--Now look at the actual execution plan, it should be the same

--Right click on the execution plan

--Now click on 'Show Execution Plan XML...'

--A new tab will open called 'Execution Plan.xml'

--In here look for the following:

<ParameterList>

<ColumnReference Column="@p2" ParameterCompiledValue="(589)" ParameterRuntimeValue="(1000)" />

<ColumnReference Column="@p1" ParameterCompiledValue="(500000)" ParameterRuntimeValue="(1670000)" />

</ParameterList>

I hope that the examples above give the reader a better understanding of what may be occurring in their SQL Server implementation, and another tool (XML Showplan) to determine differences  in estimated/actual number of rows retrieved as well as pointing out the ability to determine compiled vs. runtime paramater values.




Win a signed PowerPivot Architecture Technical Diagram at TechEd NA 2010!
Published: Sun, 06 Jun 2010 08:09:00 GMT


 image

There are many reasons to go to TechEd North America 2010 in New Orleans this year.  After all, it is THE event for developers and IT Professionals.  And if you want to know more about PowerPivot, here’s yet another reason why you need to go.

 

 

There will be a raffle to win a signed copy of the PowerPivot Architecture Technical Diagram (screenshot below) in each of the five PowerPivot sessions (five in total). The sessions are:

  • BIE04-INT - Building Custom Extensions to the PowerPivot Management Dashboard (6/7 1:00pm-2:15pm, Rm 241)
  • BIC06-INT - SQLCAT: PowerPivot Best Practices and Enterprise Case Studies (6/7 4:30pm-5:45pm, Rm 241)
  • BIE22-INT - Tips and Tricks for Troubleshooting a Microsoft SQL Server PowerPivot for SharePoint Installation (6/9 8:00am-9:15am, Rm 241)
  • BIE401 - Deep Dive on PowerPivot Technologies (6/9 1:30pm-2:45pm, Rm 238)
  • BIE303 - Deploying and Managing PowerPivot for SharePoint 2010 (6/10 9:45am-11:00am, Rm 252)

Just attend one of these sessions our very own PowerPivotGeek Dave Wickert will enter you into a raffle for a full size 44” x 34” signed copy of our PowerPivot Client/Server Architecture poster.

 

image

 

Brought to you by SQLCAT, Analysis Services Team, and our friends at SQL Marketing!

 

---

Cross posting on: sqlcat.com, SQLCAT blogs, PowerPivot blog, PowerPivotGeek.com, PowerPivotTwins.com, and dennyglee.com




SQL Azure Customer Best Practices
Published: Fri, 28 May 2010 04:46:36 GMT


The SQLCAT team and the SQL Azure team have been working closely together with a number of customers even before we launched our CTP (Community Technology Preview) in November, 2009 at the PDC (Professional Developers Conference).  After our production release in January, 2010, we’ve continued working with some interesting customers and have captured a number of great learnings in these best practice documents.  We’ve posted 2 documents to date and have a few more planned over the next few weeks.  These best practice documents are being posted to the SQL Azure download center and the links are here:

Please check back here for more updates!




Now that SQL Server 2008 R2 has RTM'ed...
Published: Fri, 14 May 2010 06:59:00 GMT


We have been pretty excited about the last few weeks where there has been a lot of buzz about SQL Server 2008 R2 RTM.  As noted in the SQL Server Data Platform Insider blog posting, this release has a lot of interesting features and is a major release.  We have spent a lot of time with many TAP customers and many of the case studies mentioned in the blog are SQLCAT customers including the following case studies: 

Ø  Bank of New Zealand,

Ø  Studio Moderna

Ø  CareGroup Healthcare System

 

So what's next for the SQLCAT and SQL Server 2008 R2?

  • Regularly visit sqlcat.com as we will continue to publish our technical notes and whitepapers on our deep technical learnings (note we had already published the PowerPivot Technical Diagram: PowerPivot Client/Server Architecture
  • Attend the joint TechEd North America / MS BI Conference in New Orleans June 7-10, 2010 - with our Ted Kummert as the key note speaker!  A number of us will be attending and presenting at this year's joint conference.
  • Plan to attend PASS 2010 North America in Seattle November 8-11,2010.  We will be continuing our SQLCAT track of sessions as well as "CSS and SQLCAT First Aid clinic" to answer all your questions.
  • You can also follow us on twitter via @sqlcat (though we typically do not answer questions via twitter due to the sheer volume of requests).

Enjoy!

 

  •  



Resolving DTC Related Waits and Tuning Scalability of DTC
Published: Tue, 11 May 2010 18:31:00 GMT


Author: Mike Ruthruff

Contributors: Gert Drapers, Fabricio Voznika

Reviewers: Prem Mehra, Jimmy May, Kun Cheng

During a recent performance lab we encountered a large number of waits related to the use of distributed transactions.  The specific application being tested used COM+ components and made heavy use of DTC transactions managed by MSDTC.  Each component was marked as either “requires a transaction” or “supports a transaction”.  This resulted in all of the database calls through this layer enlisting in a DTC transaction even when SQL Server was the only Resource Manager (RM) participating in the transaction.   

When the application was run at scale we noticed the following exposed by sys.dm_os_wait_stats.  These three DTC related waits had very high average wait times and were at least one order of magnitude higher than the other waits on the system.

wait_type

 

total_wait_time_ms

 

total_waiting_tasks_count

 

average_wait_ms

 

DTC_STATE

 

5,477,997,934

 

4,523,019

 

1,211

 

PREEMPTIVE_TRANSIMPORT

 

2,852,073,282

 

3,672,147

 

776

 

PREEMPTIVE_DTC_ENLIST

 

2,718,413,458

 

3,670,307

 

740

 

  

Each of the waits above is related to management of distributed transactions within the SQL Server engine. PREEMPTIVE_TRANSIMPORT &  PREEMPTIVE_DTC_ENLIST are expected, and since these are PREEMPTIVE_ the wait time reflects the time inside the call and is recorded for each call.   So the average time to import a transaction is 776ms and to enlist in to the transaction is 740ms.  The code path that tracks both of these also tracks the DTC_STATE waits.  So any waits within them will have an implicit wait on DTC_STATE. 

By default, the application servers used the local MSDTC coordinator to manage the transactions.  This requires RPC communication between the SQL Server and the remote coordinator which can introduce significant overhead under high transactional load.  The diagram below illustrates the initial configuration before changing to utilize a remote coordinator. 

 1

To resolve these waits we made the following configuration changes.  Instead of relying on the local MSDTC coordinator the application servers were configured to utilize a remote coordinator which resides on the database server.  This removes the need for RPC communication between the database server and application server to manage the DTC transactions and is the primary reason this resulted in such an improvement.  A secondary benefit of this configuration change was that it resulted in using MSDTC which is part of Windows 2008.  MSDTC in Windows 2008 has significant improvements over Windows 2003.  The database server was running SQL Server 2008 SP1 on Windows 2008 R2.

The following diagram illustrates the configuration after making the change to use a remote a coordinator.

2

After making this change the DTC related waits became a non-issue and did not show up in the top 10 list of wait types. Keep in mind that even if scaling issues are not encounted in benchmark testing we have measured the overhead of DTC in previous benchmarks and found that 45% performance hit for starting a distributed transaction.

Some of the considerations and tuning options for applications that utilize DTC are the following:

  1. Only use DTC transactions when they are needed, meaning only when there is more than one resource manager involved.  Every SQL Server instance will be considered a separate resource manager.
  2. Configure the application servers to utilize a remote DTC coordinator residing on the SQL Server instead of using the local MSDTC coordinator.  In our case this provides significant benefit.
  3. Use System.Transactions within .Net to manage the transactions.  System.Transactions has the intelligence to only promote transactions to distributed transactions when it detects that there are two or more Resource Manager participating in the transaction.  If there is only a single RM it will not use MSDTC.   See the following link for more information.  http://msdn.microsoft.com/en-us/library/ms973865.aspx#introsystemtransact_topic5
  4. In this specific case there could have been benefit provided by using SQL Server Service Broker (SSB) ins place of MSMQ.  Because SSB lives within the same instance of SQL Server there would never be a need to use DTC coordination since there would only ever be one RM.

Configuring DTC to use a remote coordinator 

 

Here are the steps to setup a remote DTC instance.

 

Start the Component Services add-in using Run comexp.msc

 

Right click on the My Computer node underneath Component Services->Computer

3 

Choose Properties->MS DTC tab

4

Change the default from use local coordinator to a remote one. This way you can have the middle tier server and the database server use a single one by pointing them at the same MSDTC instance. This change might increase the number of concurrent transaction inside MSDTC which will likely require you to extend the log size from the default size.

The basic MS-DTC log sizing rules are: 

1.       Rules:

·         Log Record ~144 bytes (depends on number of RM’s involved in transaction)

·         MS-DTC uses a circular transaction log and @ 80%  full, it starts to throttle

·         Default Log Size is 4MB

·         DTC will start rejecting new transactions when the current log space used is larger than the (Total Log Size) / 8

2.       Determine if this log size will support the number of concurrent transactions expected on the system.  For example:

·         4MB / 144 bytes = 30,000 transactions

·         30,000 / 8 = ~ 3,640 concurrent transactions

Increased Contention on LOCK_HASH Spinlock When Using Distributed Transactions

In addition to the above we also observed another performance related impact as a result of the heavy use of DTC.  Monitoring sys.dm_os_spinlock_stats we noticed a high number of spins and collisions on the LOCK_HASH spinlock.  

By using Extended Events in SQL Server 2008 we were able to determine the increased contention was related to heavy use of DTC.  This further reinforced the strategy of reducing the use of distributed transactions to only when they are needed (i.e. when there really is more than one Resource Manager involved).  The waits on LOCK_HASH were higher than any others by 3 orders of magnitude. 

 

lock_name

total_spins

total_collisions

spins_collisions_average

LOCK_HASH

6,804,856,030,541

97,191,819

70,014

BUF_FREE_LIST

8,645,825,749

608,557

14,207

LOGCACHE_ACCESS

136,048,559

936,664

145

SOS_OBJECT_STORE

92,629,629

609,215

152

DBTABLE

19,014,033

104,771

181

 

By using Extended Events (script below) and tracking the "backoff" event it is possible to capture the stack trace for the most common operation trying to obtain the spinlock.  Capturing stack traces for these provide the ability to tell what type of operation is contributing to the contention for any particular lock.  We found the root cause of the contention was caused by the fact that the shared database lock taken by the session must be transferred any time an enlistment into a DTC transaction takes place. This type of lock requires the spinlock to be held which it is transitioned to the transaction workspace.

The impact of this problem is that it may increase CPU consumption under high concurrency.  Reducing use of DTC will implicitly reduce this contention. 

Here is the technique used to determine the cause for the spinlock contention:

Step 1 – Obtain debug symbols for sqlservr.exe and place those in the same directory as sqlservr.exe. 

In order to see the call stacks for the backoff events you will need to have symbols for the particular version of SQL Server that you are running.  There are public symbols available for this provided through the Microsoft Symbol Server.   Complete instructions on how to download symbols can be found in this article.

Step 2 – Use Extended Events to trace the backoff events for the LOCK_HASH spinlock

The following Extended Events script will trace backoff events for the LOCK_HASH spinlock and using a bucketizer target will group identical stacks into buckets with an associated count.  The buckets with the highest counts are the threads with the highest amount of contention on the spinlock. 

Here is the script:

/*

How to trace spinlock backoffs

**************************************/

--Find the LOCK_HASH type

--The type values change from build to build so it is necessary

--to do each time a version change to SQL has been made.

select * from sys.dm_xe_map_values where map_value like '%lock%'

 

--create the even session that will capture the callstacks to a bucketizer

create event session lock_hash_spin on server

      add event sqlos.spinlock_backoff (action (package0.callstack)

where type=59) --59 = LOCK_HASH

      add target package0.asynchronous_bucketizer (

            set filtering_event_name='sqlos.spinlock_backoff',

            source_type=1, source='package0.callstack')           

      with (MAX_MEMORY=256MB, MEMORY_PARTITION_MODE = PER_CPU)

 

--start the event session and wait for some time to track the back off events

alter event session lock_hash_spin on server state=start

waitfor delay '00:01:00'

 

--Trace flag 3656 enables the call stacks to be resolved.  This requires that the

--sqlservr.pdb file reside in the same directory as sqlservr.exe

DBCC traceon (3656, -1) 

 

--To view the data run :

select event_session_address, target_name, execution_count, CAST(target_data as XML)

from sys.dm_xe_session_targets

 

--stop the event tracing session and drop the event session

alter event session lock_hash_spin on server state=stop

drop event session lock_hash_spin on server

 

After running the above script and tracing the backoff events we can see the most common callstack(s) encountered.  In our case the following was the call stack with the highest count.   As you can see this is related to enlistments into DTC transactions.  

Associated callstack for the spinlock backoff event:

 

GenericEvent::CallNextAction+45

XeSosPkg::spinlock_backoff::Publish+68

SpinlockBase::Sleep+a616b4 [ @ 0+0x0

SpinlockBase::Backoff+60

Spinlock<59,7,1>::SpinToAcquireWithExponentialBackoff+127

lck_lockInternal+c07

XactWorkspaceImp::TransferDBLock+25a

CDbLockList::UpdateXactWorkspace+4a

SEParams::UpdateXactWorkspace+7b

CMsqlXactManager::FChangeXactWorkspace+ee

CMsqlXactManager::DefectXactWorkspace+33

CMsqlXactImp::EnlistDTCXact+a5

CResMgrTopExecLevel::Execute+441

process_request+370

SOS_Task::Param::Execute+11b

Special thanks to Gert Drapers (former SQLCAT team member) and Fabricio Voznika of the SQLOS development team for their guidance on the techniques in optimizing and isolating the DTC related performance problems described above.

 




      
 
PASS Official Chapter 
 

PRPASS is a Ineta Latam Member
 
 
© 2006 Puerto Rico PASS. All rights reserved.
Designed for resolution 1024x768 or higher