Row Width Impact on Version Store Usage under Snapshot Isolation

on May 28, 2018

A question came up in my webcast today on the topic of Snapshot and Read Committed Snapshot Isolation - what impact will enabling these have on my SQL Server instance?

I talked a bit about how, for tempdb, this can vary widely depending on the number of updates and deletes running against the instance. (This is a simplification, there is more overhead, but this is often the biggest component.)

But after the webcast, I realized that the impact on the version store isn’t simply on the number of updates and deletes run. It will also vary based on the width of your rows – how many columns are in the table, what data types the columns have, and what data you have in the table.

Even if you change one very small column, the entire row is versioned - sort of

My favorite thing about SQL Server is that it includes rich tools to let us learn how the engine works by setting up simple demos. It’s truly awesome.

So I went to work to demonstrate row width impact on the version store – when only a tiny bit column is changed in the row.

Here’s how I did the test:

I created two tables, dbo.Narrow and dbo.Wide. They each each have a bit column named bitsy, along with some other columns.
I inserted one row in each table, but I put a lot more data into the row in dbo.Wide.
I allowed snapshot isolation on the database
I began a transaction in another session under snapshot isolation and left the transaction open (so version store cleanup wouldn’t kick in while I looked around)
I updated the bit column named bitsy for the single row in each table, thereby generating a row-version in tempdb for each table

The code I ran to test this is here, if you’d like to play around with it.

The row in dbo.Narrow is 38 bytes of in-row data

I measured the width of the largest row in dbo.Narrow using sys.dm_db_index_physical_stats. It only has one row, so the max width IS our row width. It’s a lil bitty 38 bytes.

The row in dbo.Wide is 8,054 bytes of in-row data, and 8,012 bytes off row

On the other side of the scale is dbo.Wide. This takes up 8,050 of in-row data (this happened because the ‘j’ column is a char column, which is fixed width). It also is using 8,012 bytes of LOB_DATA storage, because I used the REPLICATE function to put a lot of garbage into the k column in dbo.Wide.

Let’s look at the version store!

We can query the version store through sys.dm_tran_version_store. Querying this in detail on a production system may not be great if there are a lot of versions, but this is an idle test instance and I’ve only updated two rows.

There are two rows in my version store. The Narrow table has 24 bytes of versioning data, the Wide table has 8,040 bytes of versioning data.

Observation 1: my whole row was versioned except for the LOB data

Large object data stored on LOB pages, like that in my NVARCHAR(MAX) column, can be versioned. Snapshot Isolation and Read Committed Snapshot Isolation work with LOB data too, and it’s easy to verify that that they can be versioned by updating the column named ‘k’ in this demo.

But in this case, I don’t see them in the version store.

That’s awesome– because I didn’t update the column that’s stored on LOB pages! I only updated a bit type column on that row. SQL Server is being clever about this.

Observation 2: my versioned rows are 14 bytes smaller

Comparing the size of my row-versions with the rows in the table:

dbo.Narrow: 38 bytes in the table row, 24 bytes in the row version = 14 bytes difference
dbo.Wide: 8,054 bytes of in-row data in the table row, 8,040 bytes in the version store = 14 bytes difference

That 14 bytes is a transaction timestamp/pointer piece of information in row of the data table itself that is an additional part of the overhead of row versioning – but it doesn’t have to be copied into tempdb as well.

Summing up: Wider rows will generally mean more tempdb usage for versioning

Even if you’re just updating one tiny column in your table, the entire row will be versioned – with one important exception.

My testing shows that data stored on LOB pages isn’t appearing in the version store unless I modify that data itself (not simply modifying another column in the row). Note: I haven’t found this explicitly documented while searching, and I only tested this against SQL Server 2017, so there may well be times when this is not the case– but it does make sense that SQL Server could do this quite elegantly for LOB pages.

What should I do if I am considering enabling Snapshot Isolation?

If you have the ability to load test or run a ‘replay’ workload outside of your production environment, that’s absolutely the best bet.

If that isn’t realistic in your world, you can baseline performance on your current production workload, then test allowing snapshot isolation against the database before anything uses it at all. Versioning will still take place, and you can compare performance with your established baseline.