Procedures vs ad-hoc TSQL (11 minutes)

Part of "The Case of the Slow Temp Table: A Performance Tuning Problem (50 minutes)"

Extra credit: dig into estimates in ad-hoc TSQL

One difference about running the code as ad-hoc TSQL, outside of a procedure, is that @RowID is a local variable in the ad-hoc TSQL. The SQL Server optimizer does not treat local variables the same way it treats parameters.

The value of @RowID will not be “sniffed” for a local variable. Instead, SQL Server estimates 300K rows in this case when the ad-hoc TSQL is run with the local variable @RowID.

For extra credit, compare the IO and row estimates in the actual execution plan for the temp table example in the demo scripts in section “4. What if it’s a batch of TSQL, not a proc?” when run with:

  • @RowID
  • A literal value of 8000000 (no @RowID variable or parameter)


One thing to think about when you’re troubleshooting is the scope of the problem and where it occurs.

This is worth thinking about because if you’re troubleshooting a stored procedure, sometimes you won’t be able to reproduce the problem unless you’re testing a stored procedure as well. Ad hoc TSQL outside of the context of a stored procedure may behave quite differently.

Sometimes when I get into a situation where I’m testing something and I’ve got a procedure that’s acting funny, one of the things to think about is: is there a way I can work around this by changing scope? Is there a way I might be able to work around this by having some of code execute in a context that’s not a stored procedure?

Let’s try out some adhoc TSQL

Let’s first just check out some sample code that runs as a simple batch of ad hoc TSQL and see if we can reproduce the same issue that we’re having with the stored procedure. For this demo what we have is the code that’s in the stored procedure, but taken out of the procedure so that now it’s just going to run as TSQL in our session directly.

First up is the code formerly known as UserDatabaseTableTest

I’ve simply modified it a little bit so that we’ve moved the parameter out of the stored procedure context and made it {an anonymous} user variable here in the code. I’m going to run this first without execution plans on, and I’m just going to measure that statistics IO output. First we’re running just with the “permanent” user table, we’re populating once again a million rows. That’s our first step there, and then we are updating 800,000 rows.

Our duration is very similar to what we had for the stored procedure in this case. We’re at 1.3 seconds and we have a nice familiar amount of logical reads, about 53,500 logical reads. For the “permanent” user database table we do not see a big difference when running this outside of the context of the stored procedure.

Next up we have our temp table

Similarly, I’m runing it without execution plans first. Same code that was in the stored procedure, just adapted so that it’s no longer in a procedure, and we will be updating the temporary table, we are doing 800,000 rows. Running this code against the temporary table, what do we see? Well, things are kind of different here.

Our duration of the update is no longer five and a half seconds, this took us simply one second. Let’s add in a little return here so that we can see our logical reads more easily on the screen.

Huh, this is how we want our temp table to act, right?

What we would expect from our temp table is that it would require about the same number of logical reads as a “permanent” table. As I’ve taken this temp table out of the context of the stored procedure, and I’m just running it in a batch, I’m starting to see what I would expect to see but am not seeing from the stored procedure.

Let’s compare execution plans

I’m going to go back to my “permanent” table

I’m running the update now with execution plans on. Once again I get a duration of 1.3 seconds and my expected amount of logical reads, so that’s definitely reproducible.

What do I see for the execution plan? I get a wide update plan for the user database table, and that’s not too surprising. I did get a wide plan for the stored procedure working with the user database table when it compiled for 800,000 rows. We’re seeing on the clustered index seek we’re seeing about 53,500 logical reads. And then on the clustered index update just like when we saw when this was running the procedure we have zero logical reads.

Makes sense! Very similar.

How about for our temp table?

We’re going to run this now with actual plans. This was much faster. Our duration here is much faster, and once again this time it took 1.3 seconds. Last time it took about a second, much faster than five plus seconds right?

We once again got a total of 53,000 logical reads.

Did it accomplish this with a narrow plan? Well let’s see. In fact it used a wide plan. This wide plan was okay. Looking at the clustered index seek properties our actual IO statistics on this clustered index: this is behaving like a quote, unquote, “real” table, like a “permanent” table. Looking at the clustered index update operator, the actual IO statistics are zero there.

Sometimes a wide plan is fine.

The wide plan that we’re even getting for the temporary table is fine as long as we aren’t running it in the context of a stored procedure.

The plot thickens, right?

Let’s take a look at our other two examples, we’ve still got the two table variable examples

First, the “naked” update against a table variable (no recompile hint)

I’m going to go ahead and run this with actual execution plans on. We may inflate the duration a little bit, that’s okay.

We’re at about a second. The temp table that doesn’t have a recompile hint on the update, it’s always speedy, it’s always fast, and it’s just doing that desirable 53,000 logical reads there. It loves narrow plan, and if we hover over this we can see why it loves narrow plan.

The table variable, the naive table variable without a recompile hint on the update, it’s like, “Oh I don’t know how many rows are in there,” “maybe just one, maybe just one row.” And so it does a really narrow plan that would be suitable for updating not many rows.

But it turns out really well even when it’s updating 800,000 rows in this case.

A badly estimated plan may turn out well.

Sometimes we get lucky, right?

What about the table variable test using a recompile hint?

Adapting that so that it’s not in a procedure and running that, this is the one that, always in the past in the procedure, it loves a wide execution plan, and it’s always had inflated IO. This poor guy. It’s slow even outside of the context of a stored procedure.

It’s 5.3 seconds. It has 4.8 million logical reads. When we look at the execution plan, our clustered index seek, let’s look at it’s actual IO stats. It’s doing 2.45 million logical reads here.

If we look at our clustered index update operator, we are doing an additional 2.4 million logical reads there as well. So this one, for our specific pattern, the table variable when we’re using a recompile hint, it has inflated IO whether or not it is in the context of a stored procedure. It is just pathological when it comes to those logical reads.

Summing up our duration tests

This chart compares the duration for each of these different tests when they’re in the context of a stored procedure and when they’re in the context of a batch that isn’t a stored procedure.

  • The user table is always fast
  • The temp table is slow in the context of the stored procedure, but not in the context of the batch.
  • The table variable is always fast
  • Except if we use a recompile hint on the update it is always slow.

Similarly, looking at the IO, we have this inflated IO when we use the temp table in the context of a procedure, and we have inflated IO for the table-variable- updated-with-recompile-hint however we’re using it, either as a procedure or as an ad hoc batch.

Looking at the execution plans, we definitely have cases here where wide plans can be fast.

  • For the “permanent” table, wide plans are presenting no problem for it at all, it doesn’t have issues with extra IO, it’s always fast.
  • For the temp table, the wide plan is slow in the context of the procedure, but it’s not that it’s faster as ad hoc SQL because it gets a different shape plan.

It just acts like the regular table with a wide plan outside of the context of that procedure.

So, really, more continuing evidence that it isn’t that wide plans are bad for temp tables.

Wide plan seems to be bad for temp tables only in the context of a stored procedure.

This is kind of a complex thing!

There’s one pattern we have that’s just always slow. Updating the 800,000 rows in the table variable with a recompile hint is always slow.

But other than that, there is something when it comes to the temp table there is something to do with scoping.

We don’t have the problems with this temp table outside of the context of a stored procedure.

You may be saying to yourself, “This seems buggy as all heck.”

In fact, yes, this is not how this is designed. There is no Books Online article that says, “Oh it’s a best practice to only use your temp tables as batch code.” No, I would expect the temp tables to behave well even if we’re using them in a stored procedure.

More on the bug front soon, but next up what we’re going to look at is demos of: okay, this is buggy as heck, but I need to fix my code now.

What are my options? What can I do not to speed up this temp table and make this temp table faster?