Why ROWLOCK Hints Can Make Queries Slower and Blocking Worse in SQL Server

 

Sometimes when SQL Server gets slow, developers and DBAs find that the problem is blocking. After lots of work to identify the query or queries doing the blocking, one frequent idea is to add ROWLOCK hints to those queries, or to disable PAGE locks on the table, to solve the problem. This often backfires. Here's why.

The Theory: ROWLOCK Makes Locks More Granular

The idea behind the change is that by forcing SQL Server to take out row locks instead of page locks, each individual lock will be smaller. If locks are smaller, queries will be less likely to block one another.

How It’s Implemented: Hints and ALTER INDEX SET ALLOW_PAGE_LOCKS=OFF

By default, the SQL Server engine chooses row or page locks based on what it thinks is best. You can coerce the database engine into using row-based locking in two ways: a TSQL hint, or disallowing page locks on an index.

The TSQL hint is a table hint. If you have joins, you can specify it per table, like this:

SELECT fnby.Gender, fnby.NameCount, fnby.ReportYear
FROM agg.FirstNameByYear AS fnby WITH (ROWLOCK)
JOIN ref.FirstName AS fn on fnby.FirstNameId=fn.FirstNameId
WHERE fn.FirstName='Shakira';
GO

If you can’t change the query, the TSQL hint can be applied using a plan guide. (That is, if you can get the plan guide to work. They’re squirrely.)
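
If you do go the plan guide route, here's a hedged sketch of what it might look like. The guide name is mine, and the @stmt text must match the application's batch exactly, character for character:

EXEC sp_create_plan_guide
    @name = N'ForceRowlockOnFirstNameByYear',
    @stmt = N'SELECT fnby.Gender, fnby.NameCount, fnby.ReportYear
FROM agg.FirstNameByYear AS fnby
JOIN ref.FirstName AS fn on fnby.FirstNameId=fn.FirstNameId
WHERE fn.FirstName=''Shakira'';',
    @type = N'SQL',
    @module_or_batch = NULL,
    @params = NULL,
    @hints = N'OPTION (TABLE HINT (fnby, ROWLOCK))';
GO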

The ALTER INDEX variant looks like this, and impacts any query using that index:

ALTER INDEX pk_aggFirstNameByYear ON agg.FirstNameByYear SET ( ALLOW_PAGE_LOCKS = OFF );
GO

By disallowing page level locks, the only options left are ROW, TABLE, and perhaps partition level (more on that later).

One Problem: Forcing ROWLOCK Increases CPU Time and Duration if a Page Lock was Preferred

Here’s an Extended Events trace showing the difference in CPU Time and Duration for a SELECT query. The top 10 results are when the database engine chose the locking level (it selected page). The second 10 results have a ROWLOCK hint forcing row level locks against the agg.FirstNameByYear table.

[Screenshot: Extended Events trace comparing CPU time and duration for row versus page locks]

Note that the logical reads are exactly the same and neither query is doing physical reads (the execution plans are identical; the optimizer doesn't care what locks you are using). The queries were run with SET STATISTICS IO, TIME OFF and execution plans turned off, just to reduce influencing factors on duration and CPU.

The database engine is simply having to do more work here. Locking the pages in the clustered index is less work than locking each of the 1,825,433 rows.

Even though our locks are more granular, making queries run longer by taking out individual locks will typically lead to more blocking down the road.

If you’d like to run the test yourself against the SQLIndexWorkbook database, here’s the code. (Hips don’t lie, Shakira Shakira)

/* PAGELOCK */
SELECT fnby.Gender, fnby.NameCount, fnby.ReportYear
FROM agg.FirstNameByYear AS fnby
JOIN ref.FirstName AS fn on fnby.FirstNameId=fn.FirstNameId
WHERE fn.FirstName='Shakira';
GO 10

/* ROWLOCK */
SELECT fnby.Gender, fnby.NameCount, fnby.ReportYear
FROM agg.FirstNameByYear AS fnby WITH (ROWLOCK)
JOIN ref.FirstName AS fn on fnby.FirstNameId=fn.FirstNameId
WHERE fn.FirstName='Shakira';
GO 10
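
If you'd like to capture your own timings, here's a minimal sketch of an Extended Events session that collects CPU and duration per batch. It's not necessarily the exact session used for the screenshot above; the session name and file path are placeholders.

CREATE EVENT SESSION [CompareLockGranularity] ON SERVER
ADD EVENT sqlserver.sql_batch_completed
    (WHERE sqlserver.database_name = N'SQLIndexWorkbook')
ADD TARGET package0.event_file
    (SET filename = N'S:\XEvents\CompareLockGranularity.xel');
GO

ALTER EVENT SESSION [CompareLockGranularity] ON SERVER STATE = START;
GO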

Another Problem: Lock Escalation

The SQL Server Engine is a little worried about managing lots of little locks. Those locks take up memory. When you hit 5,000 locks on a single table you pass the internal lock escalation threshold and the database engine will attempt to replace your little bitty locks with one larger table level lock.

That sounds great, except for this:

  • Lock escalation is from row to table or page to table (never from row to page)
  • If your query is making a modification, the escalation will be to an exclusive lock on the whole table

It’s quite possible to turn a periodic small blocking problem with page locks into a sporadic big blocking problem due to lock escalation.

Here’s an example against the SQLIndexWorkbook database. We’re updating NameCount for a given report year. Here’s our query starting out…  (Yep, we’re setting NameCount to itself. It’ll still take out the locks.)

UPDATE agg.FirstNameByYear 
    SET NameCount=NameCount
WHERE Gender='M';

SQL Server figured out that it could run this query with an intent exclusive (IX) lock on agg.FirstNameByYear and 4,961 exclusive (X) PAGE locks on the clustered primary key. That's not awesome: it's most of the table.

But if I change the query to force row locks like this, the problem does not get better.

UPDATE agg.FirstNameByYear 
    WITH (ROWLOCK)
    SET NameCount=NameCount
WHERE Gender='M';

Now I end up with SQL Server attempting to escalate locks after it passes 5,000 row locks, retrying the operation every 1,250 new locks. Here’s an example of Extended Events catching a lock_escalation event for this query, which promoted the locks to an exclusive table lock:

[Screenshot: lock_escalation Extended Events event showing the locks being promoted to an exclusive table lock]
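
If you want to catch lock escalation in your own environment, here's a sketch of a similar Extended Events session. Again, the session name and file path are placeholders, not necessarily what was used for the screenshot:

CREATE EVENT SESSION [TrackLockEscalation] ON SERVER
ADD EVENT sqlserver.lock_escalation
    (ACTION (sqlserver.sql_text)
     WHERE sqlserver.database_name = N'SQLIndexWorkbook')
ADD TARGET package0.event_file
    (SET filename = N'S:\XEvents\TrackLockEscalation.xel');
GO

ALTER EVENT SESSION [TrackLockEscalation] ON SERVER STATE = START;
GO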

Workaround: Disable (or Change) Lock Escalation

You can control lock escalation at the table level using the ALTER TABLE command. In our example, we can do this:

ALTER TABLE agg.FirstNameByYear SET ( LOCK_ESCALATION = DISABLE);
GO

Now, our UPDATE query with the ROWLOCK hint gets 743,750 exclusive (X) KEY locks, 4,961 intent-exclusive (IX) PAGE locks, and one intent-exclusive (IX) lock on the object.
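
If you'd like to count the locks yourself, here's one way to do it, assuming you can run the statement in a transaction you roll back (sys.dm_tran_locks reports the locks held by your open transaction):

BEGIN TRAN;

    UPDATE agg.FirstNameByYear 
        WITH (ROWLOCK)
        SET NameCount=NameCount
    WHERE Gender='M';

    /* Count this session's locks by resource type and lock mode */
    SELECT resource_type, request_mode, COUNT(*) AS lock_count
    FROM sys.dm_tran_locks
    WHERE request_session_id = @@SPID
    GROUP BY resource_type, request_mode
    ORDER BY lock_count DESC;

ROLLBACK;
GO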

In some cases, disabling lock escalation and forcing row locks may work well, depending on the access patterns of your table and how much is being locked. Memory is needed to maintain locks, but the memory for lock structures is very small compared with memory for buffer pool and other components on modern SQL Servers. If you’re concerned about memory use for locks, you can baseline it and test using performance counters:

SELECT object_name, counter_name, 
    cntr_value/1024. as memory_mb
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Lock Memory (KB)';
GO

That being said, I would only disable lock escalation when needed, in a controlled setting, and when there aren't better options.

Partition Level Escalation

If you’re using partitioned tables, lock escalation still defaults to TABLE. However, you can change lock escalation so it goes from ROW -> PARTITION or from PAGE -> PARTITION.

That’s done by running:

ALTER TABLE agg.FirstNameByYear SET ( LOCK_ESCALATION = AUTO);
GO

I know, you’d think ‘AUTO’ would be the default. It’s not, because you might run into partition level deadlocks. If you have large partitioned tables and lock escalation is an issue, partition level escalation is worth testing.
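
You can check which escalation setting a table currently has by querying sys.tables:

SELECT name, lock_escalation_desc
FROM sys.tables
WHERE object_id = OBJECT_ID('agg.FirstNameByYear');
GO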

Workaround: Tune the TSQL

In our example, the obvious question is why we're even running this query at all, since it's updating a value to itself and not performing a meaningful change. If it were doing something meaningful, it'd be desirable to rework the TSQL to perform the modifications in smaller, more controlled batches. This minimizes the lock footprint.

This is typically doable for batch update jobs and behind the scenes processing.
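
Here's a minimal sketch of one batching approach for our example, assuming each ReportYear contains few enough rows to keep an individual statement well clear of trouble; pick a batch key that fits your own schema and data distribution:

DECLARE @year int;

SELECT @year = MIN(ReportYear)
FROM agg.FirstNameByYear;

WHILE @year IS NOT NULL
BEGIN
    UPDATE agg.FirstNameByYear 
        SET NameCount=NameCount
    WHERE Gender='M'
      AND ReportYear=@year;

    /* Move to the next year; @year goes NULL when we run out */
    SELECT @year = MIN(ReportYear)
    FROM agg.FirstNameByYear
    WHERE ReportYear > @year;
END;
GO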

Workaround: Use Optimistic Locking (Snapshot or Read Committed Snapshot) for SELECT Queries

One of my favorite tools to fight blocking is isolation levels. Frequently the blocked queries are predominantly read-only: they're SELECT queries. Why should those queries have to wait for the update to finish?

SQL Server’s SNAPSHOT and READ COMMITTED SNAPSHOT isolation levels are great for these queries. They can get consistent, valid data regardless of whether a modification is running by using row versioning in tempdb. This is a longer term change, not a quick fix. For transactional OLTP databases, optimistic locking is one of the best tools out there.
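
Enabling these is a database-level change. Here's a sketch against the demo database; test carefully, because switching READ_COMMITTED_SNAPSHOT on requires exclusive access to the database (the WITH ROLLBACK IMMEDIATE clause kicks out other sessions), and versioning adds load to tempdb:

ALTER DATABASE SQLIndexWorkbook SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;
GO

ALTER DATABASE SQLIndexWorkbook SET ALLOW_SNAPSHOT_ISOLATION ON;
GO

With READ COMMITTED SNAPSHOT enabled, existing SELECT queries use row versioning automatically; SNAPSHOT isolation has to be requested per session with SET TRANSACTION ISOLATION LEVEL SNAPSHOT.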

Takeaways: Use ROWLOCK Hints Cautiously

I'm not against ROWLOCK hints; I've just seen them backfire. If you choose to force row locking as part of your solution, check to make sure it's not slowing down your queries, and keep a close eye on blocking and lock escalation on your tables.


9 Responses to Why ROWLOCK Hints Can Make Queries Slower and Blocking Worse in SQL Server

  1. Maciej Łuszczyński September 13, 2016 at 1:00 am #

    Hi Kendra,

    Awesome article, many thanks!

    I'm wondering if you could advise on a more complex scenario, e.g. if there are also some additional indices on the table built upon a column being updated. Does it make any sense to add a WITH (ROWLOCK) hint to the update statement? Or is the update statement going to escalate the locks to a table lock for sure, so that the hint would only cause more deadlocks (which I can see on a live database at the moment)?

    Best regards
    Maciej

    • Kendra Little September 19, 2016 at 8:36 am #

      I hate saying never use something, because very often in practice there does tend to be a use case.

      Ideally, if an update is well indexed and can find the rows efficiently, it should make a sensible decision between row and page locks on its own. Most of the time, allowing it the option to choose a page lock just helps reduce the chances that it will attempt escalation.

      One note– escalation isn’t for sure, and it can’t always get escalation because of other users of the table. So it might attempt and never actually get that escalation. Or it might just happen sometimes. That “sometimes” is the most confusing part of troubleshooting, I think!

  2. JRStern January 3, 2017 at 4:58 pm #

    Two things. First, how many row locks versus how many page locks are we talking about, and is Extended Events reporting on shared row locks taken as a table is scanned, if page locks are not available? Second, are we looking at microseconds here, so that even the slow execution is on the order of 0.25 seconds (using a parallel plan)? That might still be 10x slower than with page locks, but still fast enough to live with, if there are other reasons to want to avoid page locks. Or maybe not fast enough to live with if it puts a burden on scanning, where we typically don't think about locking.

    I ask because I have a situation: our main table has a lot of contention and I was seriously looking at disabling page locks, or, per this article, at least using ROWLOCK on specific queries while leaving page locks enabled for the table. If simply scanning a table (which we shouldn't have to do) or a fat cluster in an index (which we do have) is that much slower when page locks are disabled … that would probably be a stopper.

    I guess these locks are generated by a scan? Because I wouldn’t expect your 2m row database to have all that many Shakira’s to generate that many locks otherwise. Or am I wrong about the count of Shakira’s? I guess I could download your database to answer some of these by inspection.

    Thanks,

    • Kendra Little January 5, 2017 at 1:27 pm #

      “First how many row locks versus how many page locks are we talking about…” This is a small, narrow table, only 45MB. It was 1,825,433 row locks vs. ~5,500 page locks. I was actually surprised that there was a consistent measurable difference on such a small table! The numbers reported here are in microseconds. As your data sizes grow, it can be more significant, but even “more significant” might be small enough not to matter, for many applications.

      My main points in writing the post were that most people don't consider that more granular locks can lead to more blocking (if you take out a lot), and that it makes lock escalation more likely. So while I sometimes will use a ROWLOCK hint, I try to consider other alternatives before implementing one.

      Re-reading the post almost a year later, I’m surprised I didn’t talk about index tuning more as a way to reduce locks. The other workarounds I mention are valid as well, but index tuning can absolutely help reduce the lock footprint. Of course, there’s many environments where you can’t index for every query and you’re going to have scans sneak through, too.

      Essentially, for your main table, I’d try to find the most frequent blocking queries and look at indexing (nonclustered, possibly columnstore stuff depending on database usage and your version), query tuning, and possibly RCSI or snapshot ISO, in addition to the option of changing the locking options allowed. They all have trade-offs, but it’s quite possible you can get better performance overall with one of the other options.

      ps: your comment was quite thought provoking, and made me realize that I want to look into locking implications and configuration options for the newer nonclustered columnstore indexes on OLTP tables. Got that on my list of things to research.

      • JRStern January 5, 2017 at 7:20 pm #

        It’s a bit of an unusual situation, at least for a SQL Server system in this day and age. The clustered PK has four fields, and most of the locking specifies at least the first two, and then one or more additional fields not in the PK. Often there is a “top(@n)” involved, too, and of course embedding in a transaction.

        At some point we will likely change it so it at least has an int identity PK, even if the four fields stay as the CK, because of some more complex challenges. It’s almost a textbook case of why the identity PKs are so common in SQL Server, but it takes a bit of a trip to come to the conclusion – even if the conclusion is actually known in advance.

        Thanks for your great site, always something here to pick up!

  3. Sriki February 13, 2017 at 8:54 am #

    Hi Kendra,

    Very well written and, more importantly, it's to the point. I found this site when looking around about query hints, in particular lock hints. This helped with the understanding a bit better. One question I do have, though, and it's based on the problem I'm currently trying to resolve.
    Issue at hand: deadlocks in production between two stored procedures hitting a common table, one trying to update while the other is trying to read. The read sproc has a query hint of (READCOMMITTEDLOCK), which is put in there, I presume, to avoid phantom reads. Due to this hint the deadlock occurs and the read proc is chosen as the victim. On looking at the trace I found that there were page locks (multiple, due to the max dop being set to 6).

    Question: Is there a way to avoid phantom reads in the select proc without using locks that cause the deadlocks?

    Thanks
    Sri

    • Kendra Little February 13, 2017 at 9:04 am #

      Thanks for the kind words.

      What isolation level is the rest of the query run under?

      Typically ReadCommittedLock is used if the database has Read Committed Snapshot Isolation (RCSI) enabled, and for some reason it was chosen to not read the “versioned” data for that particular table. However, read committed is prone to phantom reads. RCSI statements aren’t prone to phantom reads within that statement. Within a larger transaction, you can protect from phantom reads by using snapshot isolation (and if RCSI is enabled, you’re already doing the versioning anyway).

      Since it’s possible the session is using serializable isolation level or something else, I’m not going to go too far down those rabbit holes, though!
