Automatic isolation level escalation to Snapshot (15 minutes)

Transcript

Welcome to Snapshot Isolation Against Readable Availability Group Secondaries. I’m Kendra Little from SQLWorkbooks.com.

What we’ll be demonstrating

In this session, we’re going to look at an availability group that’s set up in a test environment. This particular availability group is built on top of a Windows failover cluster. Now these days, that isn’t your only option when it comes to deploying an availability group, but it is still a very widely used choice, and that happens to be how I have built out this test environment. My availability group just has one database in it, and I have two replicas in this test, my primary replica, the database there is readable and writeable, and I have configured my secondary database so that it is readable.

Now, to make readable secondaries work, there is a lot of magic that goes on behind the scenes in SQL Server. Magic that tries to, as much as possible, make those reads fast against the secondary, and also, a lot of this is in an attempt to lessen the impact of having a readable secondary, and keeping it up to date to lessen the impact that that can have on the primary replica as well.

This magic, we’re going to look at a lot of it. It goes on behind the scenes, and it’s not always obvious when it happens. So the first we’re going to do is, in our demo, we’re going to take a look at how when we run queries against our readable secondary, our isolation level gets automatically escalated to snapshot isolation, and it not obvious at all that it’s happening, but we can demonstrate the effects of it, and we can monitor it as well, which can be really useful because long running queries against our readable secondary databases can have some impacts on that primary replica.

We’re going to take a look at what the low watermark for ghost is, where you can see that, and talk about what it means as well.

Let’s dive on in now, and take a look at this test environment

My test environment is build on SQL Server 2017, and here are my replicas in my availability group. “BEEPBEEP” is the name of my primary replica, and I have an instance named “DEV” on there. “BOOPBOOP” is my secondary replica, I also have an instance named “DEV” on there as well. I have asynchronous commit configured for this particular availability group. So first off, we are going to connect to the readable secondary.

And let’s just take a look at what our databases look like in the metadata. We are querying here at first, “sys.databases”, and we’re joining to “sys.dm hadr availability “replica states”. And we just want to look at our user databases here, and see how they’re configured. The database that we’ll be working with today is “ContosoRetailDW”. This is a free sample database that you can download from Microsoft, restore it, and put it into an availability group. I have, after restoring it, our script does raise the compat level on that database to 140, and this is interesting when we look at the metadata. This “is_read_only” is coming from “sys.databases”. When we’re querying our readable secondary, “is_read_only” says zero. Well, that’s because it’s mirrored this information over from the primary database which is not read only. Our readable secondary is read only, but when I’m looking at some of this metadata, it is just a copy of what I’d see on the primary replica.

We can see that on our primary replica for this database, “read_committed_snapshot” is not enabled, and “snapshot_isolation” is also not enabled. On the primary replica for this database, versioning is not being used, and snapshot isolation is definitely not being used because it is not enabled. When I query the readable secondary of this, my isolation level is going to be automatically escalated to snapshot, even though it isn’t enabled on the primary. That escalation will just happen for me magically on the secondary.

Now over here on the right, I have some columns from the “HADR” view, here. Columns that are specific to my availability group. And I can see that for my Contoso database, this guy in the top row, it is. I’m connected to the secondary who is online. Alright, so I see that the primary is not read only, but I am connected to the secondary, so it is read only. Well, let’s use that database. So we’re now using the database who is a readable secondary.

Can we tell that our isolation level is going to be escalated to snapshot? It’s a little tricky.

I am looking, now, at DBCC user options. This is a command in SQL Server which will tell me a lot about my options for my session, including what my isolation level is. And it still shows me that my isolation level is read committed. It does not say here that “your isolation “level has been changed to snapshot.” The isolation level for my session is read committed, but when I run queries, they are going to be escalated to snapshot. But I don’t get any information for that out of DBCC user options. Well, let’s start up a query and see what the dynamic management views look like. I’m opening a new session against our secondary. This is BOOPBOOP, and I have it configured so that I have an orange bar show up for my secondary.

What I’m doing here, is I’m starting a transaction, and running a query, and just leaving it open. So now SQL Server knows this session has started transaction, and it’s run this count, and currently, the count of rows, in fact online sales for the date key, January first, 2007, is 11,242 rows. Alright, that’s great. So I’ve got a little activity that’s currently outstanding against my readable secondary.

What does its isolation level look like, though?

I’m running here, the procedure who is active. This is a free diagnostic procedure written by Adam Machanic, you can download it at whoisactive.com It is a great way to query the dynamic management views against your instance and say “what’s running against my SQL Server?” And here is, against my readable secondary, session ID 66, is my other session, and it’s been going for 30 seconds. Here’s the query that it’s running, and I’ve run “Sp_WhoIsActive” with the parameter “get_additional_info”. “get_additional_info” returns a column named, you guessed it, “additional info” that contains information seen from the dynamic management view about my session.

It includes things like date format, quoted identifier, different settings like that, it has, in here, the transaction isolation level for my session as seen through the dynamic management views. Just like DBCC USER OPTIONS said, my session is in the Read Committed level. My queries are getting escalated to snapshot isolation, but it does not show up here. So this is not how we see, if we were just looking at this, we would say “What isolation level escalation?”

Let’s update some rows

Let’s make things more interesting. Let’s start some modifications against our primary replica. I’m going to copy this query out, and now I’m connecting over to BEEPBEEP, who is my primary replica. I have this configured to a blue bar down here when I’m connected to BEEPBEEP, and we can also see it in the tab of my query. And now I’m opening another transaction that I’m leaving open.

On the primary replica, I am updating the same rows in fact online sales that I’ve already queried on the secondary.

I’m saying, for all the rows where the date keys January first, 2007, I want you to update them and set the date key January first, 2005, and sure enough, it finds all 11,242 rows, and it says “Oh, we’ve corrected the date key for that, “It’s now January first of 2005.” Well, alright. If I’m back against my secondary now, and I go over to my open transaction here, who has already counted the rows where they were 2007, what if I count them again in this transaction? What will I see here? Counting them again in this session, I now see 11,242 rows still for January first, 2007. Now, let’s think this through. Against the primary replica, I have updated these rows, but I have not committed it. I haven’t told SQL Server that I really mean it. This could be rolled back at any time. On my secondary, I’m not doing a dirty read, here. I’m not saying “read uncommitted data.” I’m saying “no hints at all, here.” So, even if I was just reading committed data, it would be correct to see 11,242. And even if I count them in another session, I don’t have an open transaction.

This is running against the secondary, I don’t have an open transaction, I should see the data that is committed right now, whether or not I’m using read committed or snapshot isolation, and sure enough, I do. Well, what if I do ask for dirty reads? What if I say “I want to see data, “whether or not it’s been committed against the secondary.” No other modifications are happening against my primary, there has been plenty of time for that data to come over, even asynchronously.

What if I use NOLOCK?

But hey, I say dirty reads, and I still see only the committed data. By the way, that “no on can’t”, it was actually just ignored. The readable secondary looks at that “no on can’t” and it’s like, “Oh you don’t really mean that.” Which I think is a great feature. You may wonder, maybe the rows haven’t come over yet. Let’s check on that. We can look at the version store on the secondary.

Exploring the version store

Part of having your isolation level escalated to snapshot means that when data’s updated and deleted on our primary replica, a copy of the previous version of the data will be stored in our version store on the secondary, so the readable secondary, so that we, if we’re reading data in a transaction, and you’re automatically escalated to snapshot, we need to see data that is consistent with the first point in time that data access happened within our transaction. So those versions help make that work. And to see, and prove that that’s what’s happening, we can look at sys.dm_tran_version_store. Now, you don’t really want to run queries in detail against a real production environment because it really is giving you rows for every version that it’s got, so that can be, in a real production environment, very prolific. But in my little environment, hey check this out, version sequence number 11,242 here. I have 11,242 rows in my version store on my secondary. The modification has been sent over here, and the version of the data that said January first, 2007, has been captured in my version store.

It’s just that the data hasn’t been committed right now, and SQL Server will not let me do dirty reads. Okay, well let’s commit that update against our primary replica. Let’s make this real against the primary replica, and see how things now behave on our secondary. So on our primary, I’ve said, “Really? “These rows are now January first, 2005.”

Back against my secondary replica…

In my query who has the open transaction, it has always seen that there are 11,242 rows for January first, 2007. We’ve still got our open transaction, here. And when I run this query again, now that the data has been committed, it still doesn’t report that to me, because this has been escalated to snapshot isolation, and I have an open transaction. Under snapshot isolation at the first point that I read from data in my transaction, after that I need to get data consistent with that point. And at this point, there were 11,242 rows for this particular date key.

So even now that a change has been committed, I am seeing data consistent with what I queried at this point.

And if I’m running a report that has multiple parts to it, and I want them all to be consistent with one another, and they’re collected by different queries that run serially, that is a very good thing. I like that. Well, alright, that’s really cool.

How about in our other session?

This session does NOT have an open transaction

I’m querying the availability group secondary, and in that session, since I don’t have an outstanding transaction, right now, just for this query, who’s in autocommit mode, I haven’t explicitly said begin tran here, this query says “Oh, well I’m looking “at committed data at this moment, “because I’m not part of a transaction “that was opened before, “I don’t see any rows for that date.” because the change that committed updating those rows to January first, 2005, it has committed, and I see that on my readable secondary. So we can see here the effects of being automatically escalated to snapshot isolation.

This explicit transaction is showing us, I am seeing data consistent with when you first access data

That is not what would happen under the read committed isolation level, that’s what would happen under our snapshot isolation level, and we see those versions that it’s using to do it. But this guy’s still open. And those versions are going to have to stay around to support this, and if we make a lot more changes, we may get a lot more versions. How can we tell that we’ve got this transaction sitting there open for a while?