What happens if I update a value used as a ‘guidepost’ in the intermediate or root pages?
You might wonder, as I wondered, what happens — this row is very important right? It’s a guidepost that we’ve seen in three levels of the index. What if we change her LastName?
Changing her LastName is going to move her in the index. The index is sorted first by LastName, so if we change her LastName, and prefix it with an A, she’s going to move to the A’s in the index. She’s going to move to a totally different page.
Suriya is currently on page 158449
Now she is currently– let’s just prove this– we’re going to mark her page number. She is currently on page ID 158449. I’m going to go ahead and update the table and change her LastName so that she now has a triple A prefix. She has been moved. Let’s see what happened.
We’re going to go way back up and look at the root page of the index and see what has changed.
The guidepost values didn’t change after this update
Oh this is interesting! Looking at the root page of the index, we still have two intermediate pages beneath it. But the thing that’s interesting is we still have the same guidepost for the data.
If you think about it, this might at first look wrong. You might think that means that the data wasn’t updated. No the data WAS updated, and that row did move, but think about, on the root and intermediate pages, these values are…
Let’s say you created a binder
And you had a page for all these different LastNames. You might have little tabs to make finding something easy. You might have little tab indicators for different parts of the alphabet that let you say: hey after this point are the “QUE’s”, or whatever you put on those tabs.
That LastName, the thing you’re alphabetizing, the data may change and it may move around in there, but that doesn’t necessarily invalidate your tabs.
Because your tabs are just guideposts to tell you where to go.
This tab, we don’t HAVE to have a row with this exact LastName present. As long as the rows are ordered in a way that says, okay, on this child page, it contains rows with this value and higher. As long as that rule is still followed, it’s fine for this exact value to not be on that child page.
Let’s step through and make sure that the data really was updated correctly, and see what happened to these pages
We changed the LastName to start with an A. To get there, we should follow this first intermediate page. We’re going to look at that first intermediate page, and a LastName starting with AAA should be on this child page, because alphabetically that is before Abrus.
We’re going to copy out this child page ID where we should find our updated row, and we’re going to plug this in and look at that page where the new row should be. Sure enough, the row is here. The update worked. Here is our friend Suriya with her new hyphenated LastName. The page previously had a row for Kim Abercrombie on it. Now Kim is no longer the first alphabetically by LastName. She has a new friend on this page.
What about the page where Suriya used to be?
Let’s look at the second intermediate page, that page that has the old guidepost for her. Sure enough, just like the root page, this has the old guidepost, but note that it says for values with this name and later, go to page 158450. We’re going to copy that out. I did copy out the page ID before.
This is the page that Suriya used to be on: she used to be on page 158449.
When we moved her row, we left that page empty.
SQL Server said: I don’t need this empty page anymore. It didn’t go to the trouble of updating all those guideposts, it didn’t rewrite the little label that helps it find the rows faster, but it said: okay we can just… we can get rid of that page, we don’t need it. We’re going to go straight to the next page.
When I go look at the page, 158450, that’s now marked by this guidepost, it has the next name alphabetically on it.
SQL Server’s very smart about efficiently keeping the little markers that help it seek to an individual page and only updating what it has to absolutely update in there for efficiency purposes
Why change the label on this guidepost if we don’t need to? But it is smart enough to not just leave empty pages in our index, which is great news.
The guidepost information, or those little labels in the root and intermediate parts of the index, they are not the data itself, they are there to help SQL Server efficiently find the data that it’s looking for in the leaf. Or if there’s no rows for it to find, to be able to know that there’s no rows.
Let’s change our index structure a little bit now
I’m going to keep the same key columns and the same included columns, but…
I’m going to run an ALTER INDEX rebuild and change my FILLFACTOR to 80
This is still a low FILLFACTOR saying leave 20 percent free on this index. This is a small index. In reality I would leave it at 100 percent full, but for the purposes of this demo, I’m doing this at 80 percent just so that I get a small tree that we can use to trace seeks on, but that still has a few pages.
Now that I’ve rebuilt the index and I’ve said: on the leaf of the index you can fill it up 80 percent, let’s look at the sys.dm_db_index_physical_stats dynamic management view to see how many levels we have.
Sure enough, it is much more compact.
Index level zero is the leaf of our index, and it only has four pages in it now
Notice the average fragmentation in percent. It is an indication of: are these pages in order? It’s 25 percent fragmented even though I just rebuilt it. That is normal when I have such a low page count, like four. It is not a cause for concern.
I don’t have an intermediate level this time, I don’t need it
So I just have a root page and then the data is spread out on four child pages. The leaf will always be level zero and then it just counts up from there. So my highest level number is level one.
And I have two levels now of the index. My leaf is only 77.8 percent full because I asked for FILLFACTOR 80.
My root page has a much lower page space used because it just doesn’t have much information on it. It only needs to track four child pages. Let’s take a look at what our page numbers are now.
Using the sys.dmdb_database_page_allocations view, just like before, here is our root page. It’s now 158584. We’re going to plug in that root page and look at it.
We can see, yes, our leaf has four pages in it.
Rebuilding the index did change all our little guideposts
We don’t have intermediate pages like before. Doing an ALTER INDEX rebuild rebuilds all level of the indexes, so now it says, okay, here are my new guideposts. The second child page starts with Chavda, FirstName Ankur, and everything before that is in the first page, and so on.
This index structure, if we start looking at the child pages, we can now see that just like before: our index is shallower, but on our leaf of the index we have the key columns as well as the included columns. Here we have MiddleName, EmergencyContactName, and Status. The difference is that because I don’t have that ridiculously low fill factor of one, I now have many more rows of data on these pages.
Let’s take a look at how this index is laid out real quick in the slides.