Tag Archives | TSQL

Tracing Deadlock Graphs: Extended Events or Server Side Trace

Deadlock graphs are incredibly helpful for figuring out why queries are getting automatically killed off by SQL Server.

But it can be tricky to trace deadlocks in SQL Server. You might be tempted to use Profiler, but the application has lots of baggage. You’re better off using either Extended Events (which may be confusing), or a Server Side Trace (which you can script from Profiler– if you know exactly which buttons to push in the right order). You might look in the System Health trace on recent versions of SQL Server, but if the deadlock didn’t happen recently, then it may have rolled off the trace.
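If you do want to check system_health first, a query along these lines pulls any deadlock reports still retained in its files. Treat it as a sketch rather than part of the gist:

```sql
/* Sketch: look for deadlock reports still retained by system_health.
   Passing just the file pattern resolves to the instance's default log directory. */
SELECT
    CAST(f.event_data AS XML) AS deadlock_graph
FROM sys.fn_xe_file_target_read_file(N'system_health*.xel', NULL, NULL, NULL) AS f
WHERE f.object_name = N'xml_deadlock_report';
```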

To make this all easier, I’ve created a gist on GitHub sharing TSQL to save you a bunch of time.

Choose the script that works for you. You can:

  1. Use a simple Extended Events trace to get deadlock graphs via the sqlserver.xml_deadlock_report event
  2. Use a Server Side SQL Trace to get deadlock graphs (for older versions of SQL Server, or people who like SQL Trace)
  3. Use a (much more verbose) Extended Events trace to get errors, completed statements, and deadlock graphs. You only need something like this if the input buffer showing in the deadlock graph isn’t enough, and you need to collect the other statements involved in the transactions. You do this by matching the transaction id on statements to the xactid for each process in the deadlock report. Warning: this can generate a lot of events and slow performance.

View or download the code from GitHub, or get it below.

When testing out these deadlock traces, you might want code to reproduce a deadlock in SQL Server.

/***********************************************************************
Copyright 2016, Kendra Little - LittleKendra.com
MIT License, http://www.opensource.org/licenses/mit-license.php
***********************************************************************/
/***********************************************************************
COLLECT DEADLOCK GRAPHS WITH EXTENDED EVENTS
This script contains TSQL to:
* Create an Extended Events Trace collecting sqlserver.xml_deadlock_report
* Start the trace
* Code to stop and delete the trace is commented out
Notes:
This works with SQL Server 2012 and higher
Change the filename to a relevant location on the server itself
Tweak options in the WITH clause to your preference
Note that there is no automatic stop for this! If you want that, use a
Server Side SQL Trace instead.
THIS CREATES AND STARTS AN EXTENDED EVENTS TRACE
***********************************************************************/
/* Create the Extended Events trace */
CREATE EVENT SESSION [Deadlock Report] ON SERVER
ADD EVENT sqlserver.xml_deadlock_report
ADD TARGET package0.event_file
(SET filename=
N'S:\XEvents\deadlock-report.xel', max_file_size=(1024),max_rollover_files=(4))
/* File size is in MB */
WITH (
MAX_MEMORY=4096 KB,
EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
MAX_DISPATCH_LATENCY=30 SECONDS /* 0 = unlimited */,
MAX_EVENT_SIZE=0 KB,
MEMORY_PARTITION_MODE=NONE,
TRACK_CAUSALITY=OFF,
STARTUP_STATE=ON)
GO
/* Start the Extended Events trace */
ALTER EVENT SESSION [Deadlock Report]
ON SERVER
STATE = START;
GO
/***********************************************************************
Test a deadlock with the code here:
https://www.littlekendra.com/2016/09/13/deadlock-code-for-the-wideworldimporters-sample-database/
***********************************************************************/
/* Stop the Extended Events trace when you want with a command like this */
--ALTER EVENT SESSION [Deadlock Report]
-- ON SERVER
-- STATE = STOP;
--GO
/* Drop the trace when you're done with a command like this */
--DROP EVENT SESSION [Deadlock Report] ON SERVER;
--GO
/***********************************************************************
THIS OPTION USES A SERVER SIDE SQL TRACE TO PICK UP THE DEADLOCK GRAPH.
YOU ONLY NEED THIS *OR* ONE OF THE XEVENTS TRACES
***********************************************************************/
/* Modified from a script generated from SQL Server Profiler */
/* Pre-requisites and notes:
This works with SQL Server 2005 and higher
Change the filename to a relevant location on the server itself
Tweak options to your preference (including the end date)
Code to stop and delete the trace is commented out at the end
THIS CREATES AND STARTS A SERVER SIDE SQL TRACE
*/
declare @rc int;
declare @TraceID int;
declare @maxfilesizeMB bigint;
declare @TraceEndDateTime datetime;
declare @TraceFilename nvarchar(500);
declare @rolloverfilecount int;
set @TraceEndDateTime = '2020-12-12 00:00:00.000';
set @maxfilesizeMB = 1024;
set @TraceFilename = N'S:\XEvents\Deadlock-Graph';
set @rolloverfilecount = 4;
/* Create the basic server side trace */
exec @rc = sp_trace_create
@TraceID output,
@options = 2 /* trace will use rollover files */,
@tracefile = @TraceFilename,
@maxfilesize = @maxfilesizeMB,
@stoptime = @TraceEndDateTime,
@filecount = @rolloverfilecount;
if (@rc != 0) goto error;
/* Add the deadlock graph event and collect some columns */
declare @on bit
set @on = 1
exec sp_trace_setevent @TraceID, 148, 1, @on  /* event 148 = Deadlock graph; column 1 = TextData */
exec sp_trace_setevent @TraceID, 148, 12, @on /* column 12 = SPID */
exec sp_trace_setevent @TraceID, 148, 14, @on /* column 14 = StartTime */
/* Start the trace */
exec sp_trace_setstatus @TraceID, 1
/* Return list of traces to the caller */
select * from sys.traces;
GOTO finish
error:
select ErrorCode=@rc
finish:
GO
/***********************************************************************
Test a deadlock with the code here:
https://www.littlekendra.com/2016/09/13/deadlock-code-for-the-wideworldimporters-sample-database/
***********************************************************************/
/***********************************************************************
STOP AND DELETE THE Deadlock-Graph Trace (when needed)
***********************************************************************/
--/* Get the trace id */
--DECLARE @id int
--SELECT @id=id
--FROM sys.traces where path like '%Deadlock-Graph%';
--/* Stop the trace by setting its status to 0 */
--EXEC sp_trace_setstatus @traceid = @id , @status = 0;
--/* Delete the trace by setting the status to 2 */
--EXEC sp_trace_setstatus @traceid = @id , @status = 2;
--/* Return remaining traces */
--SELECT * FROM sys.traces;
--GO
/***********************************************************************
COLLECT ERRORS, STATEMENTS, AND DEADLOCK GRAPHS WITH EXTENDED EVENTS
VERY VERBOSE
This script contains TSQL to:
* Create an Extended Events Trace collecting
errors, completed statements, and deadlock report
* Trace collects transaction_id on statement completed events, this
can be matched to xactid for each item in the Blocked Process Report
* Start the trace
* Code to stop and delete the trace is commented out at the end
Notes:
This works with SQL Server 2012 and higher
*** This will generate a LOT of data on most servers and may slow performance
due to collecting statement completed events. If in doubt, start
with the alternate script collecting sqlserver.xml_deadlock_report only. ***
Change the filename to a relevant location on the server itself
Tweak options in the WITH clause to your preference
Note that there is no automatic stop for this! If you want that, use a
Server Side SQL Trace instead.
THIS CREATES AND STARTS AN EXTENDED EVENTS TRACE
***********************************************************************/
/* Create the Extended Events trace */
CREATE EVENT SESSION [Errors + Statements + Deadlock Report] ON SERVER
ADD EVENT sqlserver.error_reported(
ACTION(sqlserver.sql_text,sqlserver.transaction_id)),
ADD EVENT sqlserver.sp_statement_completed(SET collect_statement=(1)
ACTION(sqlserver.sql_text,sqlserver.transaction_id)),
ADD EVENT sqlserver.sql_statement_completed(
ACTION(sqlserver.sql_text,sqlserver.transaction_id)),
ADD EVENT sqlserver.xml_deadlock_report
ADD TARGET package0.event_file
(SET filename=
N'S:\XEvents\errors-statements-deadlock-report.xel', max_file_size=(1024),max_rollover_files=(4))
/* File size is in MB */
WITH (
MAX_MEMORY=4096 KB,
EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
MAX_DISPATCH_LATENCY=30 SECONDS /* 0 = unlimited */,
MAX_EVENT_SIZE=0 KB,
MEMORY_PARTITION_MODE=NONE,
TRACK_CAUSALITY=OFF,
STARTUP_STATE=ON)
GO
/* Start the Extended Events trace */
ALTER EVENT SESSION [Errors + Statements + Deadlock Report]
ON SERVER
STATE = START;
GO
/***********************************************************************
Test a deadlock with the code here:
https://www.littlekendra.com/2016/09/13/deadlock-code-for-the-wideworldimporters-sample-database/
***********************************************************************/
/* Stop the Extended Events trace when you want with a command like this */
--ALTER EVENT SESSION [Errors + Statements + Deadlock Report]
-- ON SERVER
-- STATE = STOP;
--GO
/* Drop the trace when you're done with a command like this */
--DROP EVENT SESSION [Errors + Statements + Deadlock Report] ON SERVER;
--GO
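To match statements to a deadlock using the verbose trace above, shred the transaction_id action out of the collected statement events and compare it to the xactid values in the deadlock report. Here’s a sketch (the file path matches the trace definition above; adapt the XQuery to taste):

```sql
/* Sketch: pull transaction_id and sql_text from the collected events so they
   can be matched to xactid values in the deadlock graph */
SELECT
    ed.x.value('(event/@name)[1]', 'varchar(60)') AS event_name,
    ed.x.value('(event/action[@name="transaction_id"]/value)[1]', 'bigint') AS transaction_id,
    ed.x.value('(event/action[@name="sql_text"]/value)[1]', 'nvarchar(max)') AS sql_text
FROM (
    SELECT CAST(event_data AS XML) AS x
    FROM sys.fn_xe_file_target_read_file(
        N'S:\XEvents\errors-statements-deadlock-report*.xel', NULL, NULL, NULL)
) AS ed;
```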

Testing an Insert for Race Conditions with OStress.exe


Whenever we have multiple sessions modifying data, things get tricky. When we have a pattern of “check if the data exists and then do a thing,” multiple sessions get even more complicated.

It’s not very practical to try to open a bunch of sessions in SQL Server Management Studio to run commands in a loop. It’s hard to manage, and if you’re like me, you’ll find a way to crash SSMS doing it.

It’s much easier to test for race conditions using the free OStress.exe tool from Microsoft. Download OStress.exe as part of the RML Utilities for SQL Server (x64) here.

Our Example: Insert a Row if It Doesn’t Exist Already

We have a simple table named dbo.RaceConditionTable with two columns: RaceConditionId and RaceConditionValue. RaceConditionId is the clustered Primary Key. A unique index on RaceConditionValue prevents duplicate rows from being inserted.

Here’s how to create it:

IF DB_ID('RaceConditionTesting') is not null
BEGIN
    USE MASTER;
    ALTER DATABASE RaceConditionTesting SET SINGLE_USER WITH ROLLBACK IMMEDIATE
    DROP DATABASE RaceConditionTesting;
END
GO

CREATE DATABASE RaceConditionTesting;
GO

use RaceConditionTesting;
GO

CREATE TABLE dbo.RaceConditionTable
(
    RaceConditionId int IDENTITY(1,1) NOT NULL,
    RaceConditionValue varchar(50) NOT NULL,
    CONSTRAINT pk_RaceConditionTable PRIMARY KEY CLUSTERED (RaceConditionId ASC)
);
GO

CREATE UNIQUE INDEX ix_RaceConditionTable_RaceConditionValue on dbo.RaceConditionTable(RaceConditionValue);
GO

/* Insert one row */
INSERT dbo.RaceConditionTable (RaceConditionValue) VALUES ('Foo');
GO

Our First Attempt: A Single Select Statement

Here’s the code to insert rows into dbo.RaceConditionTable. dbo.InsertIfNotExists contains just one INSERT/SELECT statement.

Note that this is all one statement. There’s no “IF” statement, and there are no hints.

IF OBJECT_ID('dbo.InsertIfNotExists') IS NULL
    EXEC ('CREATE PROCEDURE dbo.InsertIfNotExists as RETURN 0;');
GO
ALTER PROCEDURE dbo.InsertIfNotExists (
    @RaceConditionValue varchar(50)
)
AS
    SET NOCOUNT ON;

    INSERT dbo.RaceConditionTable (RaceConditionValue) 
    SELECT x.newval
    FROM (SELECT @RaceConditionValue as newval) as x
    LEFT JOIN dbo.RaceConditionTable as p on 
        x.newval=p.RaceConditionValue
    WHERE p.RaceConditionValue IS NULL;
GO

Our Simple Test Harness: dbo.RaceConditionTest and Ostress.exe

Now we need a way to test this. dbo.RaceConditionTest is a procedure that looks at dbo.RaceConditionTable, picks the highest value for RaceConditionId in the table, adds one to it, and turns that into a character value for RaceConditionValue.

We are going to run this procedure in OStress.exe across multiple threads. Those threads will all be generating values for RaceConditionValue using the same table. Will the simple INSERT/SELECT statement use the right locks to prevent them from generating the same values? If not, some inserts will fail, because our unique index does not allow duplicate values.

IF OBJECT_ID('dbo.RaceConditionTest') IS NULL
    EXEC ('CREATE PROCEDURE dbo.RaceConditionTest as RETURN 0;');
GO
ALTER PROCEDURE dbo.RaceConditionTest 
AS
    DECLARE @namevarchar varchar(50)
    SELECT @namevarchar = CAST(MAX(RaceConditionId)+1 AS varchar(50))
    FROM dbo.RaceConditionTable

    EXEC dbo.InsertIfNotExists @RaceConditionValue=@namevarchar;
GO

I’ll run OStress.exe with this command:

ostress.exe -Q"exec RaceConditionTesting.dbo.RaceConditionTest" -n6 -r100 -o"c:\ostressoutput"

This tells OStress to run the RaceConditionTest procedure on six threads, try to run it 100 times on each thread, and to put output files in c:\ostressoutput.

Test Results: The Simple Insert/Select

OStress.exe spews insert failures everywhere with our first INSERT statement. The message “Cannot insert duplicate key row in object ‘dbo.RaceConditionTable’ with unique index ‘ix_RaceConditionTable_RaceConditionValue'” is everywhere:

[Screenshot: OStress output full of duplicate key errors from the simple INSERT/SELECT]

Here’s the insert command we’ve been using, to review:

    INSERT dbo.RaceConditionTable (RaceConditionValue) 
    SELECT x.newval
    FROM (SELECT @RaceConditionValue as newval) as x
    LEFT JOIN dbo.RaceConditionTable as p on 
        x.newval=p.RaceConditionValue
    WHERE p.RaceConditionValue IS NULL;

The OStress.exe results show that the locks taken by the LEFT JOIN existence check can be acquired and released before the INSERT runs. This means that the following can happen:

  • Session A sees that key 266 does not exist and prepares to insert
  • Session B sees that key 266 does not exist and prepares to insert
  • Session A runs its insert
  • Session B attempts to run its insert, but fails because of a duplicate key error

Revised Code: Higher Locks with UPDLOCK

What if we tell SQL Server to take out a higher lock when it reads the row? If we block when we read, maybe that will solve the problem by making others wait. Let’s test it. Here’s our revised code:

IF OBJECT_ID('dbo.InsertIfNotExists') IS NULL
    EXEC ('CREATE PROCEDURE dbo.InsertIfNotExists as RETURN 0;');
GO
ALTER PROCEDURE dbo.InsertIfNotExists (
    @RaceConditionValue varchar(50)
)
AS
    SET NOCOUNT ON;

    INSERT dbo.RaceConditionTable (RaceConditionValue) 
    SELECT x.newval
    FROM (SELECT @RaceConditionValue as newval) as x
    LEFT JOIN dbo.RaceConditionTable as p WITH (UPDLOCK) on 
        x.newval=p.RaceConditionValue
    WHERE p.RaceConditionValue IS NULL;
GO

The only change here is that the LEFT JOIN now has a hint telling SQL Server to take out an update lock (UPDLOCK) when it reads the row.

Test Results: UPDLOCK

Re-running OStress.exe, I can see that I haven’t fixed the issue. I’ve still got unique key violations galore:

[Screenshot: OStress output still showing duplicate key errors with the UPDLOCK hint]

Even though I’m taking out UPDLOCKs, the following race condition pattern can still occur:

  • Session A takes out an UPDLOCK, sees that key 266 does not exist, releases its UPDLOCK, and prepares to insert
  • Session B takes out an UPDLOCK, sees that key 266 does not exist, releases its UPDLOCK, and prepares to insert
  • Session A runs its insert
  • Session B attempts to run its insert, but fails because of a duplicate key error

We need to hold that lock.

More Revised Code: Higher Locks with UPDLOCK, Lock Duration with HOLDLOCK

We can protect against race conditions like this by telling SQL Server to take out more locks when it processes the existence check and to hold them for the duration of the transaction. Here’s the revised code:

IF OBJECT_ID('dbo.InsertIfNotExists') IS NULL
    EXEC ('CREATE PROCEDURE dbo.InsertIfNotExists as RETURN 0;');
GO
ALTER PROCEDURE dbo.InsertIfNotExists (
    @RaceConditionValue varchar(50)
)
AS
    SET NOCOUNT ON;

    INSERT dbo.RaceConditionTable (RaceConditionValue) 
    SELECT x.newval
    FROM (SELECT @RaceConditionValue as newval) as x
    LEFT JOIN dbo.RaceConditionTable as p WITH (UPDLOCK, HOLDLOCK) on 
        x.newval=p.RaceConditionValue
    WHERE p.RaceConditionValue IS NULL;
GO

Now the LEFT JOIN has a hint telling SQL Server to take out an update lock (UPDLOCK) when it reads the row AND to protect that key range for the duration of the transaction (HOLDLOCK). So we’re using the Serializable isolation level and telling it to lock as if we’re modifying, no matter what.

Test Results: UPDLOCK, HOLDLOCK

Re-running OStress.exe, I get nice clean output this time:

[Screenshot: clean OStress output with UPDLOCK, HOLDLOCK]

No key insert violations! I get the same thing running across 20 threads, 1,000 executions per thread.

There Are Downsides to Taking Out High Locks and Holding Them Longer

Higher locks that are held longer can mean more blocking, and of course that can slow you down. If new rows come in relatively rarely, I probably want to make my code a bit more complex and only take out the higher locks when I really have to. I also want to make sure my indexes are optimized to help me read and lock the smallest range possible for speed. Check out a sample of that scenario in my prior post.

 

The Case of the Blocking Merge Statement (LCK_M_RS_U locks)


Recently I got a fun question about an “upsert” pattern as a “Dear SQL DBA” question. The question is about TSQL, so it lent itself to being answered in a blog post where I can show repro code and screenshots.

Here’s the Scenario in the Anonymized Question

We have a lookup table, which is the parent table in a foreign key relationship with a child table. The child table has lots more rows. When data comes in, we need to:

  • Check the Parent Table (the lookup table) to see if the ParentValue is already present.
  • If ParentValue is not present, insert a row for the new ParentValue into the Parent Table. This happens very rarely, but we need to check for it.
  • Then insert a row into the Child Table, using the ParentId.

The Problem: Blocking against the Parent Table

When run under a few sessions, locking and blocking issues were creeping up fast. A merge statement was being used to check for the values in the Parent Table and insert them when they weren’t present.

Let’s Look At the Code

This creates a database named upsert_test, then creates ParentTable and ChildTable objects in the database. Finally, a Foreign Key is created on ChildTable, referencing the ParentTable.

IF DB_ID('upsert_test') is not null
BEGIN
    USE MASTER;
    ALTER DATABASE upsert_test SET SINGLE_USER WITH ROLLBACK IMMEDIATE
    DROP DATABASE upsert_test;
END
GO

CREATE DATABASE upsert_test;
GO

use upsert_test;
GO

CREATE TABLE dbo.ParentTable
(
    ParentId int IDENTITY(1,1) NOT NULL,
    ParentValue varchar(50) NOT NULL,
    CONSTRAINT pk_ParentTable PRIMARY KEY CLUSTERED (ParentId ASC)
);
GO

CREATE TABLE dbo.ChildTable
(
    ChildId INT IDENTITY(1,1) NOT NULL,
    ParentId INT NOT NULL,
    ChildValue VARCHAR(50) NOT NULL,
    CreatedDate DATE NOT NULL CONSTRAINT DF_Work_created DEFAULT (getdate()),
    CONSTRAINT PK_Work PRIMARY KEY CLUSTERED (ChildId ASC)
);
GO

ALTER TABLE ChildTable ADD CONSTRAINT FK_Work_Source FOREIGN KEY (ParentId) REFERENCES ParentTable (ParentId);
GO

Here’s the Upsert (aka Merge Query)

A stored procedure is used to handle incoming values. It uses MERGE to look for matching rows in ParentTable, and to insert a row when there’s no match.

IF OBJECT_ID('dbo.DoStuff') IS NULL
    EXEC ('CREATE PROCEDURE dbo.DoStuff as RETURN 0;');
GO

ALTER PROCEDURE dbo.DoStuff (
    @ParentValue varchar(50),
    @ChildValue varchar(50)
)
AS

    MERGE ParentTable with (HOLDLOCK) as p
    USING (SELECT @ParentValue NewParentValue) as new
        ON p.ParentValue = new.NewParentValue
    WHEN NOT MATCHED THEN
    INSERT (ParentValue) VALUES (new.NewParentValue);

    INSERT INTO ChildTable (ParentId, ChildValue)
    SELECT p.ParentId, @ChildValue
    FROM ParentTable p
    WHERE [email protected];
GO

Why is that HOLDLOCK Hint in the Merge Query?

My reader quite rightly used this hint in their merge query. Although MERGE looks like a single query, it’s actually just “syntactic sugar”. Behind the scenes, merge can be implemented as a select and an insert in two separate commands. Developers are advised to use HOLDLOCK to avoid race conditions with MERGE.

I asked one clarifying question — was the lock wait type they were seeing “LCK_M_RS_U”?

It was.

This confirmed that HOLDLOCK and merge were slowing them down instead of helping them.

Let’s Populate Some Rows for Testing and Reproduce the Blocking

exec dbo.DoStuff @ParentValue='Stuff', @ChildValue='Things';
GO
exec dbo.DoStuff @ParentValue='MoreStuff', @ChildValue='MoreThings';
GO
exec dbo.DoStuff @ParentValue='EvenMoreStuff', @ChildValue='EvenMoreThings';
GO

exec dbo.DoStuff @ParentValue='EvenMoreStuff', @ChildValue='EvenMoreThings x 2';
GO
exec dbo.DoStuff @ParentValue='EvenMoreStuff', @ChildValue='EvenMoreThings x 3';
GO

/* Create 5000 more ParentValues */
SET NOCOUNT ON;
DECLARE @namevarchar varchar(50), @i int=1
BEGIN TRAN
    WHILE @i <= 5000
    BEGIN
        SET @namevarchar= cast(RAND() AS VARCHAR(50));
        EXEC dbo.DoStuff @ParentValue=@namevarchar, @ChildValue='Whatever';
        SET @i=@i+1;
    END
COMMIT
GO 

To see the blocking issue, just run the following code in three session windows at the same time. Note that we’re running this over and over with the same ParentValue, and the ParentValue of “Stuff” is already in the table. This will not have to insert any rows into ParentTable.

SET NOCOUNT ON;
exec dbo.DoStuff @ParentValue='Stuff', @ChildValue='Things';
GO 1000000

Here’s what the blocking looks like in Adam Machanic’s sp_WhoIsActive:

[Screenshot: sp_WhoIsActive showing sessions waiting on LCK_M_RS_U range locks]
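Sampling with sp_WhoIsActive catches this in the act. For corroboration, the cumulative wait statistics DMV shows the same wait type accruing (a sketch; counters accumulate since the last restart or manual clear):

```sql
/* Sketch: check cumulative waits on the range lock wait type */
SELECT wait_type, waiting_tasks_count, wait_time_ms, max_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = N'LCK_M_RS_U';
```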

HOLDLOCK = Serializable Isolation Level = Key Range Locks

The holdlock hint is a way to get serializable isolation level in SQL Server for a specific table, without having to change the isolation level for your entire session. Serializable is the highest isolation level in SQL Server using pessimistic locking.

When you “HOLDLOCK”, you tell SQL Server to protect any rows you read with a range lock– just in case someone comes along and tries to change one or sneak one in.

That means that even when you’re just reading ParentTable and not inserting a row, you’re taking out a key range lock. You’re willing to fight other users over those rows to protect your statement.
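To illustrate the equivalence, these two reads take out the same kind of key range locks. This is a sketch for demonstration, not part of the repro:

```sql
/* HOLDLOCK on one table... */
SELECT ParentId
FROM dbo.ParentTable WITH (HOLDLOCK)
WHERE ParentValue = 'Stuff';

/* ...behaves like running the same statement under SERIALIZABLE */
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT ParentId
FROM dbo.ParentTable
WHERE ParentValue = 'Stuff';
SET TRANSACTION ISOLATION LEVEL READ COMMITTED; /* back to the default */
```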

There are two parts to getting around the blocking and making this faster…

Index The Parent Table (Solution Part 1)

Currently, the only index on ParentTable is on ParentId.

Even if ParentTable is tiny, if we’re frequently accessing the table and looking up a ParentValue, we’ll benefit from creating a nonclustered index on that column. We should also allow only unique values into ParentValue for data integrity reasons. A unique nonclustered index is exactly what we need:

CREATE UNIQUE NONCLUSTERED INDEX ix_ParentTable_ParentValue on dbo.ParentTable(ParentValue)
GO

In my simple three session test, this makes the merge statement very efficient, and performance goes way up. You can no longer catch those LCK_M_RS_U waits in sp_WhoIsActive. However, I’m still concerned about them, and would still…

Ditch the Merge (Solution Part 2)

The “merge” command in SQL Server is often a let-down for folks. The syntax is confusing, most folks find out about the race conditions/concurrency issues the hard way, and the biggest problem is that it often seems “better” than other TSQL options because it was introduced as an enhancement in SQL Server 2008… but it isn’t always the better choice.

In this case, ditching the merge gives me more granular control of when I want to use that high level lock on ParentTable. The code is longer mostly because of a lot of comments.

ALTER PROCEDURE dbo.DoStuff (
    @ParentValue varchar(50),
    @ChildValue varchar(50)
)
AS
    DECLARE @ParentId INT;

    /* ParentId is very rarely new, so check for it first with only a shared lock */
    SELECT @ParentId=ParentId
    FROM dbo.ParentTable
    WHERE [email protected]

    /* Insert the new value if we have to. */
    /* Use the SELECT WITH UPDLOCK in case of race conditions */
    /* Get the new ParentId so we don't have to rejoin back to the table */
    IF @ParentId IS NULL
    BEGIN
        DECLARE @OutputVal TABLE (ParentId INT)

        INSERT dbo.ParentTable (ParentValue) 
        OUTPUT inserted.ParentId INTO @OutputVal(ParentId)
        SELECT x.newval
        FROM (SELECT @ParentValue as newval) as x
        LEFT JOIN dbo.ParentTable as p WITH (UPDLOCK, HOLDLOCK) on 
            x.newval=p.ParentValue
        WHERE p.ParentValue IS NULL;

        /* We are only ever going to have one row in @OutputVal */
        SELECT @ParentId=ParentId
        FROM @OutputVal;

    END

    INSERT INTO dbo.ChildTable (ParentId, ChildValue)
    SELECT @ParentId, @ChildValue;
GO

In our scenario, it’s rare for new ParentValues to come in. So I’ve used a pattern to try to use as many shared locks against ParentTable as possible, stick with the Read Committed Isolation level, and still protect against race conditions. The pattern is essentially this:

  1. Check if ParentValue already exists (shared read locks only)
  2. If this is the rare case that a ParentValue does not exist…
    • Insert the row into ParentTable
    • Protect against race conditions by inserting with a SELECT joining to ParentTable with UPDLOCK, HOLDLOCK in case the same new row happens to come in on two sessions at the same time
    • Use the OUTPUT clause to get the new @ParentId so we don’t have to join to ParentTable again in the next step
  3. Insert into ChildTable

Rough Comparison: Do These Changes Help?

I didn’t do thorough load testing on this. I just ran the call to dbo.DoStuff above in three sessions in a 4 core VM on my MacBook and looked at the BatchRequests/sec performance counter in SQL Server. Here’s what I saw:

Setup                  | Batch Requests/sec, No Nonclustered Index on dbo.ParentTable | Batch Requests/sec, Nonclustered Index on dbo.ParentTable
Merge                  | 313 avg                                                      | 4000+
Insert with Left Join  | 495 avg                                                      | 4000+

In this test case, adding the nonclustered index makes a bigger difference than changing the TSQL. But I would still move away from merge, because I want to be able to control when anything tougher than a read lock is being taken out against ParentTable — that’s very attractive since new values come in rarely in this case. The more concurrent sessions that are running this, the more that will help.

Don’t Forget to Handle Errors

Error handling is important! The code in this post doesn’t have it for simplicity reasons. If you need a starter guide for error handling, check out Erland Sommarskog’s excellent whitepaper.
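As a bare-bones illustration of the shape error handling might take (a sketch only; wrapping the call in an explicit transaction and re-raising is one of several reasonable patterns):

```sql
/* Sketch: TRY/CATCH around the call, rolling back and re-raising on failure */
BEGIN TRY
    BEGIN TRANSACTION;
    EXEC dbo.DoStuff @ParentValue = 'Stuff', @ChildValue = 'Things';
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
    THROW; /* re-raise the original error; requires SQL Server 2012+ */
END CATCH;
```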

Further Tuning Thoughts

This code can be tuned further, but I’d want to set up a really clean load test using application servers (not SSMS) against a proper instance (maybe NOT on a MacBook). I would look at:

  • Whether validating ParentValue could be done in memory in the client application tier. Avoiding constant data access against dbo.ParentTable is attractive if that’s possible.
  • Wait statistics during execution to point to the next thing to tune.

What Do You Think of Merge?

Do you love or hate merge statements in SQL Server?

How to Script Out Indexes from SQL Server


Sometimes you need to script out all the indexes in a database.

Maybe you’re concerned something has changed since they were last checked in.

Or maybe the indexes aren’t checked into source control, and you’re working on fixing that. (Important!)

Either way, sometimes you need to do it, and it’s not fun through the GUI. I needed to write some fresh demo code for this recently, and I needed it to give the details for partitioned tables using data compression, and I thought I’d share.

Enjoy! If you’ve got ideas about ways to improve this, I’d love to hear them in the comments.

TSQL For Scripting Out All Indexes in a Database

This will script out:

  • Clustered and nonclustered indexes (including filtered indexes)
  • The filegroup the index is created on (if not partitioned)
  • The partition scheme the index is created on (if partitioned)
  • Compression settings if the index is compressed– and it’s per partition if the table is partitioned
  • Tables that don’t have a clustered index will still get a row in the results, with a comment indicating the table is a heap

The code is available in a gist (for easy copying and reference), and is also below.

SELECT
DB_NAME() AS database_name,
sc.name + N'.' + t.name AS table_name,
(SELECT MAX(user_reads)
FROM (VALUES (last_user_seek), (last_user_scan), (last_user_lookup)) AS value(user_reads)) AS last_user_read,
last_user_update,
CASE si.index_id WHEN 0 THEN N'/* No create statement (Heap) */'
ELSE
CASE is_primary_key WHEN 1 THEN
N'ALTER TABLE ' + QUOTENAME(sc.name) + N'.' + QUOTENAME(t.name) + N' ADD CONSTRAINT ' + QUOTENAME(si.name) + N' PRIMARY KEY ' +
CASE WHEN si.index_id > 1 THEN N'NON' ELSE N'' END + N'CLUSTERED '
ELSE N'CREATE ' +
CASE WHEN si.is_unique = 1 then N'UNIQUE ' ELSE N'' END +
CASE WHEN si.index_id > 1 THEN N'NON' ELSE N'' END + N'CLUSTERED ' +
N'INDEX ' + QUOTENAME(si.name) + N' ON ' + QUOTENAME(sc.name) + N'.' + QUOTENAME(t.name) + N' '
END +
/* key def */ N'(' + key_definition + N')' +
/* includes */ CASE WHEN include_definition IS NOT NULL THEN
N' INCLUDE (' + include_definition + N')'
ELSE N''
END +
/* filters */ CASE WHEN filter_definition IS NOT NULL THEN
N' WHERE ' + filter_definition ELSE N''
END +
/* with clause - compression goes here */
CASE WHEN row_compression_partition_list IS NOT NULL OR page_compression_partition_list IS NOT NULL
THEN N' WITH (' +
CASE WHEN row_compression_partition_list IS NOT NULL THEN
N'DATA_COMPRESSION = ROW ' + CASE WHEN psc.name IS NULL THEN N'' ELSE N' ON PARTITIONS (' + row_compression_partition_list + N')' END
ELSE N'' END +
CASE WHEN row_compression_partition_list IS NOT NULL AND page_compression_partition_list IS NOT NULL THEN N', ' ELSE N'' END +
CASE WHEN page_compression_partition_list IS NOT NULL THEN
N'DATA_COMPRESSION = PAGE ' + CASE WHEN psc.name IS NULL THEN N'' ELSE N' ON PARTITIONS (' + page_compression_partition_list + N')' END
ELSE N'' END
+ N')'
ELSE N''
END +
/* ON where? filegroup? partition scheme? */
' ON ' + CASE WHEN psc.name is null
THEN ISNULL(QUOTENAME(fg.name),N'')
ELSE psc.name + N' (' + partitioning_column.column_name + N')'
END
+ N';'
END AS index_create_statement,
si.index_id,
si.name AS index_name,
partition_sums.reserved_in_row_GB,
partition_sums.reserved_LOB_GB,
partition_sums.row_count,
stat.user_seeks,
stat.user_scans,
stat.user_lookups,
user_updates AS queries_that_modified,
partition_sums.partition_count,
si.allow_page_locks,
si.allow_row_locks,
si.is_hypothetical,
si.has_filter,
si.fill_factor,
si.is_unique,
ISNULL(pf.name, '/* Not partitioned */') AS partition_function,
ISNULL(psc.name, fg.name) AS partition_scheme_or_filegroup,
t.create_date AS table_created_date,
t.modify_date AS table_modify_date
FROM sys.indexes AS si
JOIN sys.tables AS t ON si.object_id=t.object_id
JOIN sys.schemas AS sc ON t.schema_id=sc.schema_id
LEFT JOIN sys.dm_db_index_usage_stats AS stat ON
stat.database_id = DB_ID()
and si.object_id=stat.object_id
and si.index_id=stat.index_id
LEFT JOIN sys.partition_schemes AS psc ON si.data_space_id=psc.data_space_id
LEFT JOIN sys.partition_functions AS pf ON psc.function_id=pf.function_id
LEFT JOIN sys.filegroups AS fg ON si.data_space_id=fg.data_space_id
/* Key list */ OUTER APPLY ( SELECT STUFF (
(SELECT N', ' + QUOTENAME(c.name) +
CASE ic.is_descending_key WHEN 1 then N' DESC' ELSE N'' END
FROM sys.index_columns AS ic
JOIN sys.columns AS c ON
ic.column_id=c.column_id
and ic.object_id=c.object_id
WHERE ic.object_id = si.object_id
and ic.index_id=si.index_id
and ic.key_ordinal > 0
ORDER BY ic.key_ordinal FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'),1,2,'')) AS keys ( key_definition )
/* Partitioning Ordinal */ OUTER APPLY (
SELECT MAX(QUOTENAME(c.name)) AS column_name
FROM sys.index_columns AS ic
JOIN sys.columns AS c ON
ic.column_id=c.column_id
and ic.object_id=c.object_id
WHERE ic.object_id = si.object_id
and ic.index_id=si.index_id
and ic.partition_ordinal = 1) AS partitioning_column
/* Include list */ OUTER APPLY ( SELECT STUFF (
(SELECT N', ' + QUOTENAME(c.name)
FROM sys.index_columns AS ic
JOIN sys.columns AS c ON
ic.column_id=c.column_id
and ic.object_id=c.object_id
WHERE ic.object_id = si.object_id
and ic.index_id=si.index_id
and ic.is_included_column = 1
ORDER BY c.name FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'),1,2,'')) AS includes ( include_definition )
/* Partitions */ OUTER APPLY (
SELECT
COUNT(*) AS partition_count,
CAST(SUM(ps.in_row_reserved_page_count)*8./1024./1024. AS NUMERIC(32,1)) AS reserved_in_row_GB,
CAST(SUM(ps.lob_reserved_page_count)*8./1024./1024. AS NUMERIC(32,1)) AS reserved_LOB_GB,
SUM(ps.row_count) AS row_count
FROM sys.partitions AS p
JOIN sys.dm_db_partition_stats AS ps ON
p.partition_id=ps.partition_id
WHERE p.object_id = si.object_id
and p.index_id=si.index_id
) AS partition_sums
/* row compression list by partition */ OUTER APPLY ( SELECT STUFF (
(SELECT N', ' + CAST(p.partition_number AS VARCHAR(32))
FROM sys.partitions AS p
WHERE p.object_id = si.object_id
and p.index_id=si.index_id
and p.data_compression = 1
ORDER BY p.partition_number FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'),1,2,'')) AS row_compression_clause ( row_compression_partition_list )
/* data compression list by partition */ OUTER APPLY ( SELECT STUFF (
(SELECT N', ' + CAST(p.partition_number AS VARCHAR(32))
FROM sys.partitions AS p
WHERE p.object_id = si.object_id
and p.index_id=si.index_id
and p.data_compression = 2
ORDER BY p.partition_number FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'),1,2,'')) AS page_compression_clause ( page_compression_partition_list )
WHERE
si.type IN (0,1,2) /* heap, clustered, nonclustered */
ORDER BY table_name, si.index_id
OPTION (RECOMPILE);
GO

Why ROWLOCK Hints Can Make Queries Slower and Blocking Worse in SQL Server


Sometimes when SQL Server gets slow, developers and DBAs find that the problem is blocking. After lots of work identifying the query or queries doing the blocking, a frequent next idea is to add ROWLOCK hints to those queries, or to disable PAGE locks on the table. This often backfires – here’s why.

The Theory: ROWLOCK Makes Locks More Granular

The idea behind the change is that by forcing SQL Server to take out row locks instead of page locks, each individual lock will be smaller. If locks are smaller, queries will be less likely to block one another.

How It’s Implemented: Hints and ALTER INDEX SET ALLOW_PAGE_LOCKS=OFF

By default, the SQL Server engine chooses row or page locks based on what it thinks is best. You can coerce it into row-level locking in two ways: TSQL hints, or disallowing page locks on an index.

The TSQL hint is applied per table. If you have joins, you can specify it for each table individually, like this:

SELECT fnby.Gender, fnby.NameCount, fnby.ReportYear
FROM agg.FirstNameByYear AS fnby WITH (ROWLOCK)
JOIN ref.FirstName AS fn on fnby.FirstNameId=fn.FirstNameId
WHERE fn.FirstName='Shakira';
GO

If you can’t change the query, the TSQL hint can be applied using a plan guide. (That is, if you can get the plan guide to work. They’re squirrely.)
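For the curious, here’s a rough sketch of what that plan guide might look like. This is only a sketch: the @stmt text must match the application’s batch exactly, whitespace included (a big part of why plan guides are squirrely), and the guide name here is made up.

```sql
/* Sketch only: @stmt must match the application's batch text exactly.
   The guide name 'RowlockForShakira' is invented for this example. */
EXEC sp_create_plan_guide
    @name = N'RowlockForShakira',
    @stmt = N'SELECT fnby.Gender, fnby.NameCount, fnby.ReportYear
FROM agg.FirstNameByYear AS fnby
JOIN ref.FirstName AS fn on fnby.FirstNameId=fn.FirstNameId
WHERE fn.FirstName=''Shakira'';',
    @type = N'SQL',
    @module_or_batch = NULL,
    @params = NULL,
    @hints = N'OPTION (TABLE HINT (fnby, ROWLOCK))';
GO
```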

The ALTER INDEX variant looks like this, and impacts any query using that index:

ALTER INDEX pk_aggFirstNameByYear ON agg.FirstNameByYear SET ( ALLOW_PAGE_LOCKS = OFF );
GO

By disallowing page level locks, the only options left are ROW, TABLE, and perhaps partition level (more on that later).

One Problem: Forcing ROWLOCK Increases CPU Time and Duration if a Page Lock was Preferred

Here’s an Extended Events trace showing the difference in CPU Time and Duration for a SELECT query. The top 10 results are when the database engine chose the locking level (it selected page). The second 10 results have a ROWLOCK hint forcing row level locks against the agg.FirstNameByYear table.

(Image: Extended Events trace comparing CPU time and duration for ROW vs. PAGE locks)

Note that the logical reads are exactly the same and neither query does physical reads (the execution plans are the same – the optimizer doesn’t care what locks you are using). The queries were run with SET STATISTICS IO,TIME OFF and Execution Plans turned off, just to reduce influencing factors on duration and CPU.

The database engine is simply having to do more work here. Locking the pages in the clustered index is less work than locking each of the 1,825,433 rows.

Even though our locks are more granular, making queries run longer by taking out individual locks will typically lead to more blocking down the road.

If you’d like to run the test yourself against the SQLIndexWorkbook database, here’s the code. (Hips don’t lie, Shakira Shakira)

/* PAGELOCK */
SELECT fnby.Gender, fnby.NameCount, fnby.ReportYear
FROM agg.FirstNameByYear AS fnby
JOIN ref.FirstName AS fn on fnby.FirstNameId=fn.FirstNameId
WHERE fn.FirstName='Shakira';
GO 10

/* ROWLOCK */
SELECT fnby.Gender, fnby.NameCount, fnby.ReportYear
FROM agg.FirstNameByYear AS fnby WITH (ROWLOCK)
JOIN ref.FirstName AS fn on fnby.FirstNameId=fn.FirstNameId
WHERE fn.FirstName='Shakira';
GO 10

Another Problem: Lock Escalation

The SQL Server Engine is a little worried about managing lots of little locks. Those locks take up memory. When a single statement takes around 5,000 locks on a single table, you pass the internal lock escalation threshold and the database engine will attempt to replace your little bitty locks with one larger table level lock.

That sounds great, except for this:

  • Lock escalation is from row to table or page to table (never from row to page)
  • If your query is making a modification, the escalation will be to an exclusive lock on the whole table

It’s quite possible to turn a periodic small blocking problem with page locks into a sporadic big blocking problem due to lock escalation.

Here’s an example against the SQLIndexWorkbook database. We’re updating NameCount for every row where Gender = 'M'. Here’s our query starting out… (Yep, we’re setting NameCount to itself. It’ll still take out the locks.)

UPDATE agg.FirstNameByYear 
    SET NameCount=NameCount
WHERE Gender='M';

SQL Server figured out that it could run this query with an intent exclusive (IX) lock on agg.FirstNameByYear and 4,961 exclusive (X) PAGE locks on the clustered primary key. That’s not awesome – it’s most of the table.
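If you’d like to see those lock counts yourself, leave the update’s transaction open (BEGIN TRAN, run the update, don’t commit yet) and count its locks from a second session. A minimal sketch, where the session_id is a placeholder you’d replace with the one running the update:

```sql
/* Run from a second session while the UPDATE's transaction is still open.
   Replace 52 with the session_id of the session running the update. */
SELECT resource_type, request_mode, COUNT(*) AS lock_count
FROM sys.dm_tran_locks
WHERE request_session_id = 52
GROUP BY resource_type, request_mode
ORDER BY lock_count DESC;
GO
```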

But if I change the query to force ROWLOCK like this, the problem doesn’t get better.

UPDATE agg.FirstNameByYear 
    WITH (ROWLOCK)
    SET NameCount=NameCount
WHERE Gender='M';

Now I end up with SQL Server attempting to escalate locks after it passes 5,000 row locks, retrying the operation every 1,250 new locks. Here’s an example of Extended Events catching a lock_escalation event for this query, which promoted the locks to an exclusive table lock:

(Image: lock_escalation event captured in Extended Events)
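If you want to catch lock escalation yourself, here’s a minimal Extended Events session sketch for the lock_escalation event. The session and file names are my own; the file lands in the default LOG directory if you don’t give a path.

```sql
/* Minimal sketch: session and file names are arbitrary */
CREATE EVENT SESSION lock_escalation_trace ON SERVER
ADD EVENT sqlserver.lock_escalation (
    ACTION (sqlserver.session_id, sqlserver.sql_text))
ADD TARGET package0.event_file (SET filename = N'lock_escalation_trace')
WITH (MAX_DISPATCH_LATENCY = 3 SECONDS);
GO
ALTER EVENT SESSION lock_escalation_trace ON SERVER STATE = START;
GO
```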

Workaround: Disable (or Change) Lock Escalation

You can control lock escalation at the table level using the ALTER TABLE command. In our example, we can do this:

ALTER TABLE agg.FirstNameByYear SET ( LOCK_ESCALATION = DISABLE);
GO

Now, our UPDATE query with the ROWLOCK hint gets 743,750 exclusive (X) KEY locks, 4,961 intent-exclusive (IX) PAGE locks, and one intent exclusive (IX) lock on the object.

In some cases, disabling lock escalation and forcing row locks may work well, depending on the access patterns of your table and how much is being locked. Memory is needed to maintain locks, but the memory for lock structures is very small compared with memory for buffer pool and other components on modern SQL Servers. If you’re concerned about memory use for locks, you can baseline it and test using performance counters:

SELECT object_name, counter_name, 
    cntr_value/1024. as memory_mb
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Lock Memory (KB)';
GO

That being said, I would only disable lock escalation when needed, in a controlled setting, when there aren’t better options.

Partition Level Escalation

If you’re using partitioned tables, lock escalation still defaults to TABLE. However, you can change lock escalation so it goes from ROW -> PARTITION or from PAGE -> PARTITION.

That’s done by running:

ALTER TABLE agg.FirstNameByYear SET ( LOCK_ESCALATION = AUTO);
GO

I know, you’d think ‘AUTO’ would be the default. It’s not, because you might run into partition level deadlocks. If you have large partitioned tables and lock escalation is an issue, partition level escalation is worth testing.

Workaround: Tune the TSQL

In our example, the obvious question is why we’re even running this query, since it’s updating a value to itself and not performing a meaningful change. If it were doing something meaningful, it’d be desirable to rework the TSQL to perform the modification in smaller, more controlled batches. This keeps each statement under the lock escalation threshold and minimizes the lock footprint.

This is typically doable for batch update jobs and behind the scenes processing.
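One hedged sketch of what batching could look like here, stepping through the table in ranges of its FirstNameId key. This assumes FirstNameId is a reasonable batching key for this table, and the batch size is a guess – tune both for your own data.

```sql
/* Sketch: step through the table in key ranges to stay under
   the ~5,000-lock escalation threshold. FirstNameId as batching
   key and the 4000 batch size are assumptions to adjust. */
DECLARE @current_id INT, @max_id INT, @batch_size INT = 4000;

SELECT @current_id = MIN(FirstNameId), @max_id = MAX(FirstNameId)
FROM agg.FirstNameByYear;

WHILE @current_id <= @max_id
BEGIN
    UPDATE agg.FirstNameByYear
        SET NameCount = NameCount
    WHERE Gender = 'M'
        AND FirstNameId >= @current_id
        AND FirstNameId < @current_id + @batch_size;

    SET @current_id = @current_id + @batch_size;
END
GO
```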

Workaround: Use Optimistic Locking (Snapshot or Read Committed Snapshot) for SELECT Queries

One of my favorite tools to fight blocking is isolation levels. Frequently the blocked queries are predominantly read-only – they’re SELECT queries. Why should those queries have to wait for the update to finish?

SQL Server’s SNAPSHOT and READ COMMITTED SNAPSHOT isolation levels are great for these queries. They can get consistent, valid data regardless of whether a modification is running by using row versioning in tempdb. This is a longer term change, not a quick fix. For transactional OLTP databases, optimistic locking is one of the best tools out there.
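Enabling read committed snapshot at the database level looks like this. Be careful with WITH ROLLBACK IMMEDIATE: it rolls back other sessions’ open transactions so the change can take effect, so save it for a maintenance window.

```sql
/* Rolls back open transactions so the database-level
   setting can be applied immediately */
ALTER DATABASE SQLIndexWorkbook
    SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;
GO
```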

Takeaways: Use ROWLOCK Hints Cautiously

I’m not against ROWLOCK hints, I’ve just seen them backfire. If you choose to force row locking as part of your solution, check to make sure it’s not slowing down your queries and keep a close eye on blocking and lock escalation on your tables.

How to Check if an Index Exists on a Table in SQL Server


It seems like this should be easy. But it’s not.

Code Should be Rerunnable – So You Need to Check if Indexes Exist

Whenever you set up a script to create or drop an index, you want a safety check in there. Drop the index if it exists. Create the index if it doesn’t.

Or do something else programmatically. Checking if an index exists is a pretty frequent task. But there’s no simple function to test if an index exists in SQL Server.

Here’s what I’ll show you in this post:

  • Example code to check if an index exists using OBJECT_ID. The code is simpler, but it requires a shared schema lock on the table you’re checking.
  • Example code to check if an index exists just using joins. This uses less locking, but is wordier code.
  • Examples of why simpler code doesn’t do the trick.
  • Why the DROP_EXISTING=ON clause in CREATE INDEX doesn’t make this any simpler, either.
  • New syntax in SQL Server 2016 simplifies code for dropping indexes, but we’re still on our own when it comes to checking before creating an index, or for writing reports.

Our Example Index: ix_halp

We want to know if an index named ix_halp exists on the table agg.FirstNameByYear in the SQLIndexWorkbook database. It does! It was created with this code:

CREATE NONCLUSTERED INDEX ix_halp
    ON agg.FirstNameByYear (ReportYear)
INCLUDE (FirstNameId, Gender, NameCount);
GO

Let’s say we’re running code that is going to create the index if it does NOT exist. If we just re-run the create statement, we get the error:

Msg 1913, Level 16, State 1, Line 1
The operation failed because an index or statistics with name 'ix_halp' already exists on table 'agg.FirstNameByYear'.

We have to check if the index exists, or our code will blow up.

(Image: an index being burned by dragon fire)

Sometimes you just have to go all Game of Thrones on an index.

Option 1: Query sys.indexes with the OBJECT_ID() Function

This method works well, but it uses the OBJECT_ID() function. OBJECT_ID() will take out a shared schema lock (SCH_S) on the table we pass into it. Shared schema locks are very lightweight, but it is kind of a bummer to have to lock the table itself if we’re just querying metadata. It’s not always an issue, but I’ve seen scripts cause problematic blocking by using OBJECT_ID too much.

Here are two flavors of the code. First, using COUNT:

IF 1 = (SELECT COUNT(*) as index_count
    FROM sys.indexes 
    WHERE object_id = OBJECT_ID('agg.FirstNameByYear')
    AND name='ix_halp')
PRINT 'it exists!'
ELSE PRINT 'nope';
GO

Second, a simple existence check:

IF EXISTS (
    SELECT 'foo'
    FROM sys.indexes 
    WHERE object_id = OBJECT_ID('agg.FirstNameByYear')
    AND name='ix_halp')
PRINT 'it exists!'
ELSE PRINT 'nope';
GO

These both do 2 logical reads and are very fast.
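The same existence check makes the index creation itself rerunnable, instead of just printing:

```sql
IF NOT EXISTS (
    SELECT 'foo'
    FROM sys.indexes 
    WHERE object_id = OBJECT_ID('agg.FirstNameByYear')
    AND name='ix_halp')
CREATE NONCLUSTERED INDEX ix_halp
    ON agg.FirstNameByYear (ReportYear)
INCLUDE (FirstNameId, Gender, NameCount);
GO
```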

Option 2: Query sys.indexes, sys.objects, and sys.schemas (Fewer Locks)

If, for some reason, you want to avoid the SCH_S lock against the table whose indexes you’re checking, you just need a couple more joins.

Here’s an option using COUNT:

IF 1 = (SELECT COUNT(*) as index_count
    FROM sys.indexes AS si
    JOIN sys.objects AS so on si.object_id=so.object_id
    JOIN sys.schemas AS sc on so.schema_id=sc.schema_id
    WHERE 
        sc.name='agg' /* Schema */
        AND so.name ='FirstNameByYear' /* Table */
        AND si.name='ix_halp' /* Index */)
PRINT 'it exists!'
ELSE PRINT 'nope';
GO

Same thing, but using EXISTS:

IF EXISTS (
    SELECT 'foo'
    FROM sys.indexes AS si
    JOIN sys.objects AS so on si.object_id=so.object_id
    JOIN sys.schemas AS sc on so.schema_id=sc.schema_id
    WHERE 
        sc.name='agg' /* Schema */
        AND so.name ='FirstNameByYear' /* Table */
        AND si.name='ix_halp' /* Index */)
PRINT 'it exists!'
ELSE PRINT 'nope';
GO

These do 11 or 12 logical reads against the internal metadata tables.

Don’t Try This: OBJECT_ID() Doesn’t Work

It seems like an index might be considered an object in SQL Server. But it’s not.

We might try this simple code, but it won’t work:

SELECT OBJECT_ID(agg.FirstNameByYear.ix_halp);

This throws an error, because it doesn’t understand schema.table.indexname:

Msg 4104, Level 16, State 1, Line 4

The multi-part identifier "agg.FirstNameByYear.ix_halp" could not be bound.

Similarly, this code won’t be accurate:

SELECT COUNT(*)
FROM sys.objects
WHERE name='ix_halp';
GO

Even if the index exists, it will return a count of 0. The index ‘ix_halp’ doesn’t have a row in sys.objects.

Bummer: CREATE INDEX WITH (DROP_EXISTING = ON) Fails if the Index Doesn’t Exist

It seems like there’s an easy shortcut if you’re creating indexes. There isn’t. If our index does not exist yet and we run this:

CREATE NONCLUSTERED INDEX ix_halp
    ON agg.FirstNameByYear (ReportYear)
INCLUDE (FirstNameId, Gender, NameCount)
WITH (DROP_EXISTING = ON);
GO

Our create will fail sadly with this error:

Msg 7999, Level 16, State 9, Line 1
Could not find any index named 'ix_halp' for table 'agg.FirstNameByYear'.

The DROP_EXISTING clause is only for changing the definition of an index that already exists. This has NOT changed in SQL Server 2016.

SQL Server 2016 Gives Us ‘DROP INDEX IF EXISTS’

At least index cleanup gets syntactically easier in SQL Server 2016: DROP INDEX gets a new conditional clause to check for existence.

Behold:

DROP INDEX IF EXISTS agg.FirstNameByYear.ix_halp;
GO

Whee! At least that’s nice and easy.
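On earlier versions, the rerunnable drop is the familiar combo of the existence check and DROP INDEX:

```sql
/* Pre-2016 equivalent of DROP INDEX IF EXISTS */
IF EXISTS (
    SELECT 'foo'
    FROM sys.indexes 
    WHERE object_id = OBJECT_ID('agg.FirstNameByYear')
    AND name='ix_halp')
DROP INDEX ix_halp ON agg.FirstNameByYear;
GO
```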