Why isolation levels matter-- and what read committed means (9 minutes)

Part of "Read Committed is Bonkers - Webcast Recording (46 minutes)"


Good morning everybody. Welcome to Read Committed is Bonkers. I’m Kendra Little from SQLWorkbooks.com.

What we’re going to cover

We’re going to talk about what read committed is in this session. What are isolation levels? What is the read committed isolation level, specifically? And I’m going to show you just one of many examples of how you can get incorrect results when running a query in the default isolation level in SQL Server. Let’s talk about why this is the default isolation level and then what you should do next as well at the end.

What I’m showing you today isn’t a bug

‘This isn’t problems that might be fixed with the read committed isolation level. If it was a bug, that might mean it would be corrected or could be corrected in the future.

In fact, what I’m showing you today is one of the many issues that are by design with the read committed isolation level. So this isn’t something that is going to be corrected for this isolation level in the future, and in fact folks at Microsoft, this isn’t a big secret, folks at Microsoft have blogged about this over the years.

As long as 10 years ago, a fellow named Craig Friedman who is in the SQL Server Optimizer team, he wrote a series of blogs about why by design you can get incorrect results when running queries in read committed and how it works. I mean, they not only admitted that it happens, they say, here’s how it can happen.

And part of why they explained that is you do have alternatives to this. They of course do have trade-offs. So we’re going to show today how the problems can happen at a really simple level in read committed.

First up, what is read committed isolation level in the first place?

Isolation levels has to do with how do we handle having multiple users in a database and having the ability for data to change while you’re reading data? When I am reading data, if somebody else’s modifying the data, what happens?

That is what isolation levels dictate.

For that reader, what did they do? That’s transaction isolation.

Now, read committed is so important because it is the default isolation level for SQL Server. There is an exception which we’re going to talk about a little bit at the end. Azure SQL Database which is a hosted database service, the default isolation level you get with Azure SQL Database is different. They changed the default isolation level for that, and I think there’s good reason.

Now you can change the isolation levels in your own database and in your own code. So you can choose not to use read committed, it’s your choice, but it’s important to understand the choice that you’re making first.

Here’s an example of transaction isolation

If you are modifying data, let’s say you’re running an update on some data and I at the same time want to go in and read that data. Well, what should I do? Should I wait until you’re done?

Should I read the data while you’re changing it? Should I do something else entirely? That is the question that isolation levels answer. You control the isolation levels for your sessions against SQL Server, for the most part.

There is a few little exceptions where you don’t get to pick, but for the most part, you control your isolation level. And you get to set your isolation level.

So you get to decide: should I wait until you’re done, and by accepting that default isolation level of read committed, you are making a choice to stick with that default isolation level.

Under read committed, while you are updating that data, I say, okay, I only want to read data that has been committed. So when you’re in the process of your update, and the read is, you’re writing data, it’s not committed yet. I’m like, oh no no no. That’s not committed. I am going to wait until you are done.

Similarly, if I’m reading the data right now, and then you come in and you want to update the exact data that I’m reading, I, under read committed, as I read the data, I will take out some locks protect that data so that you can’t just change it at the exact time that I am reading it.

Read committed uses a mechanism of locks for a protection

When you choose an isolation level or choose to go with the default, your isolation level will determine, is it going to use locks to protect data when it reads it. If so, how many locks will it use, what kind of locks will you use, and how long should you hold those locks?

When you choose to stick with read committed, you say, okay, when I’m reading data, I am going to take out locks to protect it but I’m just going to take out, I’m not going to, like if I’m reading a page in a table, I’m not going to lock the whole table so that nobody can look at it at all. I’m going to go lighter.

I’m going to read committed data, but I’m just going to really try to protect what I’m reading right now.

And I’m going to use some shared locks.

Now, I am going to be careful to make sure that the table’s not dropped when I’m using it. My locks will protect the object from being dropped, but also the rows or pages that I’m reading right now, I’m going to use the shared locks, so that nobody can modify them while I’m reading it. But other people can read it too.

Read committed uses locks, but the locks can be pretty cooperative—

Because in a lot of databases we have more readers than we have writers. We may have a bunch of people wanting to read a page, but only one person wanting to write that page, so read committed tries to help the reader share and be able to read it while keeping the data from being protected.

And it doesn’t hold the locks for a super long time. I’m only going to love the page I’m with, and lock it. I’m not going to lock out an entire range of stuff that I don’t happen to be reading right now. I’m going to lock what I read as I read and then let it go.

Read committed likes to let it go

Here’s an example. What does this look like? We haven’t had many helpful drawings yet.

For example, let’s say we have a table that just holds a list of unique first names and some information about them. This is for baby name data in the United States, and this is not a very big table. The table is 3.6 megabytes, it has 90,000-ish rows here. The number’s gone up since the number was put on this slide. You’ll see in a minute.

And we’ve got first names in here. These are unique names of babies born in the United States. And we create a non-clustered index on the first name column. Notice that on the left of this slide, we have names that start with a, Aaban is the first name and then on the right of the slide we have names that start with z, z, y. This is, the first name is the first key in this index, so our non-clustered index is an alphabetical list of first names, ‘cause if we go back and look at our table, the table itself has a clustered primary key on FirstNameId, which is the surrogate integers. The table itself is just ordered by FirstNameId.

We have now created a non-clustered index on the first name column itself, it is just the one column and the key of the index that has to be created.

If we scanned this index, in the read committed isolation level, and I am just reading through the index, scans can either go forward or backward. This one happens to be a forward scan. So it goes over to the a’s and the bird here is our reader. And notice that it’s holding a lock in its mouth. As it scans the leaf of this index, it locks the rows as it goes, and it’s only locking what it’s reading, and then when it’s done, it lets go of those locks. It does that even if I say begin tran.

And I’m in read committed, and I do just this read, I lock the rows as I go and then I let them go. That means that I am sharing, these are shared locks. Other people can read even while I’m locking, other people can read the, the thing is I’m protecting the data that I’m reading. As I’m reading it, I am protecting it from being modified.

Now, that sounds great, right? Nobody can modify the data while I’m reading it, but it means that problems can happen with concurrency.

And let’s take a look at that in a demo of how just locking, just using share locks to protect what I’m reading can lead to reading data that is just plain wrong.