The Top 3 Mistakes I Made Fighting Blocking

on February 16, 2017

At the beginning of the “Troubleshooting Blocking and Deadlocks” course, I mention that it took me a long time to get into using the tools I show in the course.

The tools are all free, and many of them are built into SQL Server.

Update: the course itself is now free, too!

So why did it take me so long?

1. I was using outdated built-in commands to troubleshoot live blocking

Lots of us learn to use built-in tools from the SQL 2000 days and prior when we start using SQL Server. Things like:

  • sp_who2
  • DBCC INPUTBUFFER
  • DBCC OPENTRAN

These commands are feast or famine: they either give you either more than you want, or not as much information as you need.

These old commands haven’t been improved over the years by Microsoft for a very good reason: Microsoft gave us a much richer, more flexible set of tools when they introduced Dynamic Management Objects in SQL Server 2005, and they continue to improve the DMOs year after year. The old commands are still there for backwards compatibility.

In the Blocking course, I demonstrate how you can make finding live blockers fast and easy by querying SQL Server dynamic management objects with the free sp_WhoIsActive procedure.

This awesome free procedure is by Adam Machanic, and you can download it from http://whoisactive.com/.

People still learn old commands like sp_who2 first, and get confused. Just this week, I got a question for my Dear SQL DBA podcast about sp_who2 from someone who’s learning now. Here’s the ‘Dear SQL DBA’ video where I show why sp_WhoIsActive makes life much less confusing:

2. I pummeled the SQL Server looking for blocking with custom scripts

Like a good DBA, I wanted to be proactively notified about blocking. Specifically, before our customers notified me about the blocking.

To do this, I wrote a bunch of TSQL that looked for blocked queries. And I ran it against the SQL Server. I ran it A LOT.

This query alone wasn’t enough to bring down the SQL Server. But of course this wasn’t the only monitoring query I ran. There were more, and hilariously when you got to looking at what were the most expensive overall queries against the SQL Server…

Yeah, of course it was all those custom monitoring queries.

When you’re troubleshooting a problem like “not enough CPU”, or you’re thinking about server sizing and CPU licensing comes to mind, you don’t really want to have the secret that the #1 CPU using queries against the instance are your monitoring queries.

There are better ways! For blocking, I really like using a simple performance counter alert based on the “Processes Blocked” performance counter. It’s really easy to set up, it’s very lightweight. I show how to set this up in the course, along with other tools that will help you figure out who was blocking whom if you don’t get there until after the fact.

3. I thought Trace Flags –T1204 and –T1222 should be enough to fix deadlocks

Trace flags 1204 and 1222 are well known in SQL Server. They’ve been around for a while. Each of these flags causes SQL Server to write information about a deadlock to the SQL Server Error Log when a deadlock occurs. One of them writes info in an XML format, the other just in text.

I knew enough to enable these flags. And I’d see that deadlocks had happened.

But I had no idea how to interpret the information, other than to be able to see little bits of the queries that were involved. I thought the problem was me.

Well, maybe the problem was partly me, but it’s also really crappy to try to output a large wad of text information printed into an error log. It’s like trying to read a book in a foreign language where all the pages are out of order.

I’ve found it much easier to collect and interpret deadlock graphs. There’s still a lot of complex information to synthesize: that’s the same. However, you get a graphic display of the problem, too. This helps you navigate around as you try to understand what’s going on.

It’s also much easier when someone shows you how to decode the foreign language! So in the course I:

  • Give you code to generate a deadlock (and it’s rerunnable, no need to restore the database)
  • Show you how to read the deadlock graph
  • Give you tips on how to plan out solutions to prevent the deadlock from re-occurring