Decoding Key and Page WaitResource for Deadlocks and Blocking

troubleshooting-deadlocksIf you use SQL Server’s blocked process report or collect deadlock graphs, occasionally you’ll come across things that look like this:

waitresource=PAGE: 6:3:70133

waitresource=KEY: 6:72057594041991168 (ce52f92a058c)

Sometimes there’s more information in the massive monster of XML that you’re scanning through (deadlock graphs have a resource list that help reveal the object and index name), but sometimes there isn’t.

Here’s a reference on how to decode them.

All of this information is out there on the internet already in various places, it’s just spread out! I’m going to pull the whole thing together, from DBCC PAGE to hobt_id to the undocumented %%physloc%% and %%lockres%% functions.

First we’ll talk through PAGE lock waits, then we’ll hit the KEY lock waits.

1) waitresource=PAGE: 6:3:70133 ” = Database_Id : FileId : PageNumber

If your query was waiting on a page level lock, SQL Server gives you the page address.

Breaking “PAGE: 6:3:70133” down, we’ve got:

  • database_id=6
  • data_file_id = 3
  • page_number = 70133

1.1) Decode the database_id

Find the database name with this query:

SELECT 
    name 
FROM sys.databases 
WHERE database_id=6;
GO

That’s the WideWorldImporters sample database on my SQL Server instance.

1.2) Look up the data file name — if you’re interested

We’re going to use the data file id in the next step to find the name of the table. You can just move on. But if you’re curious about the name of the data file, you can look it up by using the database and plugging the data file id into this query:

USE WideWorldImporters;
GO
SELECT 
    name, 
    physical_name
FROM sys.database_files
WHERE file_id = 3;
GO

In WideWorldImporters, this is the data file named WWI_UserData, and I restored it to C:\MSSQL\DATA\WideWorldImporters_UserData.ndf. (Whoops, you caught me putting files on my system drive! Oh no! The embarrassment).

1.3) Get the name of the object from DBCC PAGE

We know this is page # 70133 in data file 3 in the WideWorldImporters database. We can look at that page with the undocumented DBCC PAGE and Trace Flag 3604.

Note: I prefer running DBCC page against a restored backup elsewhere, because it’s not supported. In some cases, running DBCC PAGE can cause stack dumps.

/* This trace flag makes DBCC PAGE output go to our Messages tab
instead of the SQL Server Error Log file */
DBCC TRACEON (3604);
GO
/* DBCC PAGE (DatabaseName, FileNumber, PageNumber, DumpStyle)*/
DBCC PAGE ('WideWorldImporters',3,70133,2);
GO

Scrolling down in the output, I can find the object_id and IndexId:

dbcc-page-waitresource

Whew, almost there!

I can now find the table and index name with this query:

USE WideWorldImporters;
GO
SELECT 
    sc.name as schema_name, 
    so.name as object_name, 
    si.name as index_name
FROM sys.objects as so 
JOIN sys.indexes as si on 
    so.object_id=si.object_id
JOIN sys.schemas AS sc on 
    so.schema_id=sc.schema_id
WHERE 
    so.object_id = 94623380
    and si.index_id = 1;
GO

And behold, this lock wait was on the PK_Sales_OrderLines index on the Sales.OrderLines table.

Note: In SQL Server 2014 and higher, you could also find the object name using the undocumented sys.dm_db_database_page_allocations dynamic management object. But you have to query all the  pages in the database, which seems not as awesome against large databases — so I listed the DBCC page method.

1.4) Can I see the data on the page that was locked?

Well, yes. But … do you really need to?

This is slow even on small tables. But it’s kinda fun, so… since you read this far… let’s talk about %%physloc%%!

%%physloc%% is an undocumented piece of magic that will return the physical record locator for every row. You can  use %%physloc%% with sys.fn_PhysLocFormatter in SQL Server 2008 and higher.

Now that we know that the page lock wait was on Sales.OrderLines, we can see all the data in that table on data file = 3 and page number = 70133 with this query:

Use WideWorldImporters;
GO
SELECT 
    sys.fn_PhysLocFormatter (%%physloc%%),
    *
FROM Sales.OrderLines (NOLOCK)
WHERE sys.fn_PhysLocFormatter (%%physloc%%) like '(3:70133%'
GO

Like I said, this is slow even on tiny tables. I’ve added NOLOCK to the query because while we want a glance at this info, we have no guarantee that it’s the way it was earlier when the blocking happened anyway– we’re guessing, so we may as well do dirty reads.

But woo hoo, it gives me a clean display of the 25 rows which the query was fighting for:

physloc-data-on-page-with-resourcewait

That’s enough detail on waitresource=PAGE. What if you were waiting on a KEY?

2) waitresource=KEY: 6:72057594041991168 (ce52f92a058c)” = Database_Id, HOBT_Id ( Magic hash that you can decode with %%lockres%% if you really want)

If your query was trying to lock a row in an an index and was blocked, you get a totally different style of address.

Breaking “6:72057594041991168 (ce52f92a058c)” down, we’ve got:

  • database_id = 6
  • hobt_id = 72057594041991168
  • magic hash value = (ce52f92a058c)

2.1) Decode the database_id

This works the exact same way it did for the page example above! Find the database name with this query:

SELECT 
    name 
FROM sys.databases 
WHERE database_id=6;
GO

In my case, that’s still the WideWorldImporters sample database.

2.2) Decode the hobt_id

We need to use that database, and then query sys.partitions, with some helper joins to figure out the table and index name…

USE WideWorldImporters;
GO
SELECT 
    sc.name as schema_name, 
    so.name as object_name, 
    si.name as index_name
FROM sys.partitions AS p
JOIN sys.objects as so on 
    p.object_id=so.object_id
JOIN sys.indexes as si on 
    p.index_id=si.index_id and 
    p.object_id=si.object_id
JOIN sys.schemas AS sc on 
    so.schema_id=sc.schema_id
WHERE hobt_id = 72057594041991168;
GO

That tells me that the query was waiting for a lock on Application.Countries, using the PK_Application_Countries index.

2.3) Now for some %%lockres%% magic – if you want to figure out which row was locked

If I really want to know exactly which row the lock needed, I can decode that by querying the table itself. We can use the undocumented %%lockres%% function to find the row equal to that magic hash value.

Note that this is going to scan the table, and on large tables that might not be so awesome all the time:

SELECT
    *
FROM Application.Countries (NOLOCK)
WHERE %%lockres%% = '(ce52f92a058c)';
GO

I added NOLOCK to this query (as Klaus Aschenbrenner suggested on Twitter) because locking can be an issue — and in this case, you’re looking to get a glance at the data as it is now, not as it was earlier when the transaction ran– so I don’t think data consistency is a big issue.

Voila, the row we were fighting for appears!

lockres-sqlserver-waitresource-key

Credits and more reading

I’m not sure who first documented many of these things, but here are two posts on some of the less documented nitty gritty that you may enjoy:

Previous Post
Should You Rebuild or Reorganize Indexes on Large Tables? (Dear SQL DBA Episode 19)
Next Post
Where do You Get Your Creativity? (Dear SQL DBA Episode 20)

Related Posts

22 Comments. Leave new

That’s awesome! I’ve used physloc for other stuff but never to flesh out a deadlock to that level of granularity. Have never interpreted those key locks, now I’m going to have to try that sometime…

You could also look in your maintenance history (if you log index maintenance, like Ola’s solution does) to check for a potential reorg or rebuild between the deadlock and when you are actually troubleshooting…you could end up landing on a page in the wrong object if that index got rebuilt in the intervening time, or an unallocated page. Deadlocks are best troubleshooted (troubleshot?) when fresh…

Reply

Super cool!
Thanks Kendra

Reply
manishkumar1980
November 7, 2016 9:59 am

Thanks Kendra.

There’s also Object:dbid:lock partitionid wait resource.

Please also describe that.

Reply

Any idea why I’d get
Msg 468, Level 16, State 9, Line 190
Cannot resolve the collation conflict between “SQL_Latin1_General_CP1_CI_AS” and “Latin1_General_100_CI_AS_KS_WS_SC” in the equal to operation.
when trying the %%lockres%% thing on SQL2016, my local database has the SQL_Latin1 collation.

When I try the same code and technique on an Azure database, it works as advertised.
Thanks,

Reply

[…] you’re looking at individual reports, my post “Decoding Key and Page WaitResource for Deadlocks and Blocking” will help you dig into the nitty gritty […]

Reply
Darko Martinovic
February 27, 2017 1:22 am

In my opinion, the resource name is very important information when analyzing blocking/locking problems. As well as resource content.

Usually, I’d like to see this information as a part of custom blocked(deadlock) process report

That’s the reason, why I developed two scripts, using t-sql & sqlclr

I published both script on

http://www.sqlservercentral.com/scripts/T-SQL/153452/

http://www.sqlservercentral.com/scripts/T-SQL/153467/

Reply
Marcelo Moraes
July 20, 2017 10:04 am

Amazing!

Reply

Hi Kendra,

I am puzzled that I am not able to decode a key lock wait. I could gess the table it has been on by looking at the sqltext. This table had some indexes rebuilded since the blocking occurred. Is that the reason why the hobt id is not there any more?

Thanks in advance

Martin

Reply

Very good article on clearly understanding where exactly the deadlock occured in a volatile OLTP environment/

Reply
INDRAJEET TRIGUNAYAT
March 14, 2018 11:42 pm

Good Post 🙂

Reply

Hey Kendra –

One of my deadlock reports shows it in the waitresource=“KEY: … format and I see that the only the magic hash value is different between the victim and the winner. Now the object itself(using the hobt_id) is a non clustered index on the table and when I use the %%lockres%% function with and without the index(non clustered hint), I get no rows. Why do you think this is happening ?

Also, this index itself has columns in this order – TestNonClustIx – (colA, colB, colC)

Now both the winner and victim sessions have same values of colA and colB. They differ only with colC. Do you think I can avoid the deadlock if I change the index order to TestNonClustIx – (colC, colA, colB)

Thanks!

Reply

    It could be that the exact rows are not there anymore in the index.

    If you change the index order as described, it sounds like possibly the one of the queries might not use the new index at all– because it doesn’t care about ColC, which is now the leading column. While that might avoid the deadlock, it might also slow that query down a lot, which could cause a different problem. So I’d be hesitant of that change without lots more testing.

    Reply

Nnaa Mhen!!! Some people are so hot with SQL Server

Reply

[…] resources when the resource type is a key, a page, or an object (I suggest Kendra Little’s blog post) There is however a noticeable glut on articles explaining RID (a RID is a key on a table with no […]

Reply

[…] wait resources when the resource type is a key, a page, or an object (I suggest Kendra Little’s blog post) There is however a noticeable glut on articles explaining RID (a RID is a key on a table with no […]

Reply

I have a blocked process report with waitresources which included waitresource=Page:…, waitresource=Object:… etc. but also a waitresource which isn’t prefixed by a word e.g. waitresource=”9:1:123456789″ – If I treat it as a Page wait – I get a valid object – so are “no-prefix waits” the same as Page waits or is this a coincidence?

Reply

Late to this party and this is a really esoteric question..I am in the Azure SQL Database world, looking at deadlock detail presented via Azure SQL Analytics. When I see Wait resource: KEY: 12:72057594113490944 (9755ee930a2a) and this were SQL Server I would expect 12 to reference the database_id in sys.databases. In Azure SQL Database that is not the case. What am I missing?

Reply

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Menu