Why You Should Switch in Staging Tables Instead of Renaming Them (Dear SQL DBA Episode 26)

Over the years, I’ve come across a pattern fairly frequently: an application in an OLTP database periodically creates new tables, loads some data into them and fixes it up, then does a switcheroo and replaces old tables with the new tables.

This can cause major problems with blocking if anyone else is querying the table. The problem looks like this:

  1. Innocent query / queries are selecting from the current production table
  2. Switcheroo process comes along and tries to do an sp_rename
  3. Switcheroo process is blocked and is waiting on a schema modification lock
  4. Switcheroo process starts blocking EVVERRRYBODYYYY else who wants to query the table

Sometimes this clears quickly, but it varies depending on how long the queries take and how complex the Switcheroo process is. On a busy system, a big blocking backlog can even potentially cause THREADPOOL waits, which means the SQL Server generally feels like it’s not working for anyone.
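
If you want to watch this unfold on a live system, the blocking chain is easy to spot in the DMVs. Here’s a minimal diagnostic sketch (nothing specific to this pattern, just who is blocked, who is blocking them, and what they’re running):

--Sessions that are currently blocked, their blockers, and their query text
SELECT
    r.session_id,
    r.blocking_session_id,
    r.wait_type,
    r.wait_time,
    t.text AS query_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;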

This is a tough problem, because you can’t get around it with isolation levels (even read uncommitted/nolock queries require a shared schema lock, which blocks a schema modification lock). You also can’t get around it with optimistic locking.
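
You can see the schema lock behavior for yourself: while a NOLOCK query runs, it holds a schema stability (Sch-S) lock on the object, and Sch-S is incompatible with the schema modification (Sch-M) lock that sp_rename needs. A quick sketch, assuming a dbo.ProductionTable big enough to query slowly:

--Session 1: a long-running read, dirty reads and all
SELECT COUNT(*)
FROM dbo.ProductionTable AS pt1 WITH (NOLOCK)
CROSS JOIN dbo.ProductionTable AS pt2 WITH (NOLOCK);

--Session 2: while session 1 runs, look at the object-level locks it holds
SELECT request_session_id, resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE resource_type = 'OBJECT'
    AND resource_database_id = DB_ID()
    AND resource_associated_entity_id = OBJECT_ID('dbo.ProductionTable');
--request_mode shows Sch-S: even a NOLOCK query takes a schema stability lock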

In the past, I wrote that if you have to do this switcheroo, sp_rename is better than ALTER SCHEMA TRANSFER, but it still has a bunch of problems.

Today, a little light bulb went on. There is a better way to do this than sp_rename if you have existing code using this pattern!

Note: If you can avoid the ‘switcheroo’ pattern altogether, and simply create and manage multiple versions of your tables and have your application use the latest version, that is best: it avoids the locking problem entirely. The trick in this post is for existing codebases where sp_rename or ALTER SCHEMA TRANSFER is already in use, and you need something to mitigate blocking problems in the short term.

Here’s a 12-minute video talking through the pattern. You can also scroll down below the video to read a written version of the solution. If you enjoy the video, you might like to subscribe to the podcast. I would also love a review on iTunes!

This pattern works in SQL Server 2014 and higher. And it even works in Standard Edition of 2014.

Some folks will see the word ‘Switch’ in this pattern and assume the pattern that I’m suggesting is Enterprise Edition only for versions before SQL Server 2016 SP1.

However, oddly enough, you can use partition switching even in Standard Edition, as long as the tables only have one partition.

And all rowstore tables have at least one partition! That happens automagically when you create a table.
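
If you’d like proof, sys.partitions shows at least one row for every rowstore table, partitioned or not. A tiny sketch (the table name is made up):

CREATE TABLE dbo.NotPartitionedAtAll (i INT NOT NULL, col1 CHAR(1) NULL);
GO

SELECT OBJECT_NAME(object_id) AS table_name, index_id, partition_number, rows
FROM sys.partitions
WHERE object_id = OBJECT_ID('dbo.NotPartitionedAtAll');
--One row comes back with partition_number = 1, even though we never partitioned anything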

Here’s the pattern: Use SWITCH PARTITION with WAIT_AT_LOW_PRIORITY

If you’d like to play around with this in full, I’ve got a big old code sample in a gist for you to use on your test instance. But here’s the part that contains the magic:
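
If you’d rather not open the gist, here’s a minimal setup so the snippet below can run on its own. The table and column names are placeholders; what matters is that all three tables have identical structure and indexes, in the same filegroup:

CREATE TABLE dbo.ProductionTable (
    i INT NOT NULL,
    col1 CHAR(1) NULL,
    CONSTRAINT pk_ProductionTable PRIMARY KEY CLUSTERED (i)
);

CREATE TABLE dbo.ProductionTableOld (
    i INT NOT NULL,
    col1 CHAR(1) NULL,
    CONSTRAINT pk_ProductionTableOld PRIMARY KEY CLUSTERED (i)
);

CREATE TABLE dbo.StagingTable (
    i INT NOT NULL,
    col1 CHAR(1) NULL,
    CONSTRAINT pk_StagingTable PRIMARY KEY CLUSTERED (i)
);

--Load and fix up the fresh data in the staging table
INSERT dbo.StagingTable (i, col1) VALUES (1, 'a'), (2, 'b');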

BEGIN TRAN

    --Switch the current production data out to a holding table. If we're blocked,
    --wait politely at low priority for up to a minute, then kill off the blockers
    ALTER TABLE dbo.ProductionTable SWITCH PARTITION 1 TO dbo.ProductionTableOld PARTITION 1
        WITH ( WAIT_AT_LOW_PRIORITY ( MAX_DURATION = 1 MINUTES, ABORT_AFTER_WAIT = BLOCKERS ));

    --Anyone who tries to query the table after the switch has happened and before
    --the transaction commits will be blocked: we've got a schema mod lock on the table

    --Switch the freshly loaded staging data in to the now-empty production table
    ALTER TABLE dbo.StagingTable SWITCH PARTITION 1 TO dbo.ProductionTable PARTITION 1;

COMMIT

Some notes:

  • For Standard Edition below SQL Server 2016 SP1, the syntax is a little different: ALTER TABLE dbo.ProductionTable SWITCH TO dbo.ProductionTableOld;  (Thanks to Ben for pointing this out in the comments!)
  • The reason this solution helps is that WAIT_AT_LOW_PRIORITY won’t cause the big blocking chain behind it if this gets blocked. It will sit by the side MUCH more gracefully, even if it can’t get a schema modification lock. (Here’s a Microsoft post that goes into a lot of detail about wait at low priority – if you want the nitty gritty.)
  • This also helps because you get to pick how long it waits, and what it does after that time is up. In this sample I say to kill off the blockers. That’s not going to be a good choice all the time: maybe you’d like it to just sit there waiting at low priority, or to give up itself. You get the choice.
  • You probably do want to use an explicit transaction with this, unless it’s OK for someone to query the table while it’s empty, between the two switch operations. If that’s the case, you’d want the WAIT_AT_LOW_PRIORITY options on the second switch as well.
  • This only works when the tables are in the same filegroup. Otherwise it wouldn’t be a metadata only change.
  • You have to create matching indexes on your staging table and production table to make the switching work. (But you were doing that work before you renamed the new table in, anyway.) I kept the sample gist super simple, but you can add indexes to it if you want to see it in action; there’s a quick sketch right below this list.
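
For example, if the production table picks up a nonclustered index, the other two tables need matching indexes before the switches will succeed (reusing the placeholder tables from the setup above):

--Production table has a nonclustered index...
CREATE INDEX ix_col1 ON dbo.ProductionTable (col1);

--...so the staging table needs a matching one, or its switch in will fail...
CREATE INDEX ix_col1 ON dbo.StagingTable (col1);

--...and so does the old table, since the current data switches into it first
CREATE INDEX ix_col1 ON dbo.ProductionTableOld (col1);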

What if I want to truncate the data, instead of switching it out?

The TRUNCATE TABLE command doesn’t have the WAIT_AT_LOW_PRIORITY option. Using it would put you right back in your big old blocking chain problem. A DELETE statement will be logged and also has blocking problems.

Instead, you can use the SWITCH pattern above, and then just immediately truncate dbo.ProductionTableOld. As long as that table isn’t being read by anyone, you don’t have a blocking problem truncating it.
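
Put together, the ‘truncate’ variation looks something like this (same placeholder names as above):

--Switch the current data out, waiting politely if we're blocked
ALTER TABLE dbo.ProductionTable SWITCH PARTITION 1 TO dbo.ProductionTableOld PARTITION 1
    WITH ( WAIT_AT_LOW_PRIORITY ( MAX_DURATION = 1 MINUTES, ABORT_AFTER_WAIT = BLOCKERS ));

--Nobody queries the 'old' table, so truncating it doesn't block anyone
TRUNCATE TABLE dbo.ProductionTableOld;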

Anyone think this is a terrible idea?

There could well be something I’m missing about this solution. Happy to hear about it in the comments if you see a problem!


26 Comments

I’d just create a new table and have the app dynamically use the latest table.

So if there is a table Transactions30000 and processes were scanning that…

Switcheroo process creates and populates new table NewTransactions. At end, sp_rename NewTransactions to Transactions30001

Current processes reading Transactions30000 continue to completion unblocked

New processes start using Transactions30001 which from their perspective simply ‘appears’ ready populated

Periodically have a job drop the ‘old’ transaction tables

Best thing: it does not need any funky Enterprise features, and it can be made to work on any SQL platform where you can query the schema!


    Great point — and I agree that versioning the tables is preferable to switching whenever you can do it. I’ll put a note in the article that for new code, it’s better to version the tables and avoid the locking issues altogether. When I’ve worked with this before, there’s usually a user table that notes when versions were published and which version is currently active.

    For existing systems that are already using sp_rename or alter schema transfer, I’ve typically suggested going to versioned tables as a long term fix. Unfortunately there are usually so many code changes involved that it’s tough to get it implemented.

    One clarification: the switch option doesn’t require Enterprise features, either. Switch has apparently always been available in Standard Edition for a single-partition table, oddly enough. (Just don’t want someone to read this and think they need a different edition.)
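
    (For illustration only, here’s a rough sketch of what that version-registry table might look like; every name in it is made up:)

    CREATE TABLE dbo.TableVersions (
        table_base_name SYSNAME NOT NULL,
        version_number INT NOT NULL,
        published_at DATETIME2 NOT NULL DEFAULT SYSDATETIME(),
        is_active BIT NOT NULL,
        CONSTRAINT pk_TableVersions PRIMARY KEY (table_base_name, version_number)
    );
    --The app reads this table to find the active version, e.g. Transactions30001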


I like this idea!

Another solution I’ve heard of people using to overcome the locking limitation is to use inline table valued functions in front of the tables. You can then just alter the function to switch to the table with fresh data in it.
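
(For readers who haven’t seen that trick, it looks roughly like this sketch; the function and table names are hypothetical:)

CREATE FUNCTION dbo.CurrentTransactions()
RETURNS TABLE
AS
RETURN
    SELECT i, col1 FROM dbo.Transactions30000;
GO

--After loading the new table, repoint the function at it
ALTER FUNCTION dbo.CurrentTransactions()
RETURNS TABLE
AS
RETURN
    SELECT i, col1 FROM dbo.Transactions30001;
GO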


    I do like the idea of versioning the table, but if I have long-running queries using the TVF, wouldn’t I have the same issue with blocking and a schema modification lock when trying to change the TVF?


      Yes, it does have the same schema modification locking problem… I swear I remember reading, or having a conversation about, functions being less blocking than views or renames for switching out the tables, but now I’m thinking I must have dreamt it. In any event, the switch statement seems like a much better approach. Thanks for the article!


Interesting… A couple of things. Make sure the compression setting is the same across all three tables, or you get a:

ALTER TABLE SWITCH statement failed. Source and target partitions have different values for the DATA_COMPRESSION option.

It also complained that the table wasn’t partitioned, even though it did the switcheroo anyway.


Why not use a view and point it to each new version of the table?


    Sure, that can work. However, when you alter the view so that it references the new table, you need a schema modification lock on the view. So if long-running queries are using the view, you have just moved the blocking issue from the table to the view.

    I am not saying versioning the tables can’t work; it can. You just have to make the data access layer more clever about picking up the names of the new tables and using them at the right time. That complexity is part of why it’s hard to get versioning implemented in existing systems that use the rename technique.


Thanks so much for posting this high-quality blog post! I just used this, and it works great! The only thing I changed was to remove “PARTITION 1” from both parts of the query, as it would give me the warning:

The specified partition 1 for the table ‘database.dbo.tablename’ was ignored in ALTER TABLE SWITCH statement because the table is not partitioned. [SQLSTATE 01000] (Message 4903)

Changing the statement to something that looks like the below alleviated the warning and it still worked as you described.

ALTER TABLE dbo.StagingTable SWITCH TO dbo.ProductionTable;

Thanks again and keep up the awesome work!


We started running into issues caused by ALTER SCHEMA TRANSFER a few weeks ago, so we are looking for alternatives. Partition switching looks promising, and it is actually something we were using prior to ALTER SCHEMA TRANSFER. However, we couldn’t use it for tables referenced by foreign keys on other tables without having to drop and then re-create the FKs outside of the switch operation, and the latter is costly in terms of time and its impact on performance, due to the size of the table.

I wonder if there are workarounds for this? What about using synonyms and switching the tables they point to… Could this have the same issues as modifying views?


I want to note that you need to handle the primary key naming as well. First create the primary key on the staging table with a ‘staging’ suffix in its name, then rename it after the switch so the name doesn’t already exist the next time around.


Never mind my last comment. I confused myself about something.


Actually, Esther, if you require that the staging table does not reuse any IDs in the oldProd then you will need to set the seed accordingly, so thanks for bringing this up for consideration.


How about using synonyms to switch between views or tables in two databases, one being the active database and the other the inactive database where data changes would be performed, and then using the synonyms to switch the tables/views? You would wrap the drop/create of the synonym in a transaction, so it would wait for an application or user to let go of the lock, and then the drop/create would happen in a split second, so the next person/app would grab the new synonym, now pointing at the other (now active) database and table/view. Besides the momentary locks, can anyone see anything wrong with this idea? All the users will be doing is reading the data from these databases; only the SSIS package will update/insert, and then only to the inactive database. Thanks.
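
(For concreteness, the swap described above might look roughly like this sketch; the synonym and database names are made up:)

BEGIN TRAN

    --Repoint the synonym from the previously active database to the freshly loaded one
    DROP SYNONYM dbo.ReportData;
    CREATE SYNONYM dbo.ReportData FOR InactiveDB.dbo.ReportData;

COMMIT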


    Check out Nico Africa’s comment and my reply above – this can work, but you can get large blocking chains due to the schema modification locks. There’s no WAIT_AT_LOW_PRIORITY, unfortunately.


Thank you for the great post.

I have a question about indexing.

We typically don’t index our staging data, but we do index (heavily) the production table. This leaves our production table unavailable for quite some time. What I’d like to do is do all my indexing on the staging data, and do the swap, such that the newly minted production table immediately has all those indexes from staging.

Is this how it works?

(assuming I have the same indexes on both staging and production tables?)


Hi,

Could you please advise on the question below?

Can we rename a partitioned table using sp_rename (like a normal table) or the GUI in SQL Server 2014?

sp_rename 'db_name.old_table_name', 'new_table_name'

I have seen the article below; it is very good.

https://blogs.msdn.microsoft.com/docast/2015/12/15/sql-server-impact-of-renaming-the-partitioned-table-on-data/

But my question is: why can’t we use the above command to rename a partitioned table? What is the technical reason behind it?

My views are below. Kindly correct me if I am thinking in the wrong direction.

The object ID is referenced by the DB engine, and the object name is referenced by the user (here, the admin or developer).

Partition logic works on the partition scheme and partition function.

So in this scenario the partition scheme and function stay the same, and only the table name changes.

Kindly suggest.


    Hi there,

    I didn’t write that other article (and I don’t have time to read it and test all their demo code) but you can use sp_rename with a partitioned table. It works fine.

    Kendra


1) This could be problematic if the business has naming convention restrictions. Sure, you can switch the staging table into the original table… but what happens to the naming of all the indexes and constraints? Don’t they need to be individually renamed anyway?

2) What about tables that have triggers? Do they need to be disabled beforehand? How does that work?


    Great questions —

    1) When you switch in, you’re taking on the metadata of the table you’re switching into. So the names of the constraints on the table you switch into “win” (not the names you had on the staging table). So it should be fine from a naming convention perspective. (Going on memory here, feel free to test, though!)

    2) I don’t recall that triggers add any complexity. Per the docs on partition switching: “No INSERT, UPDATE, or DELETE triggers or cascading actions are activated by moving table partitions, and it is not required for the source or target tables to have similarly defined triggers to move partitions.”

Tim Cartwright (April 1, 2019):

Kendra, I know this post is a couple of years old but hopefully you can answer this question. According to https://docs.microsoft.com/en-us/previous-versions/sql/sql-server-2008-r2/ms191160(v=sql.105) there is this blurb:

Source and target tables must not be replicated. Neither the source table nor the target table can be sources of replication.

Does that mean you cannot use this technique if the table is currently being replicated in an availability group?

