Verifying the results - and a recap (4 minutes)

Part of "Speed Up the "Popular Names" Query SQLChallenge (46 minutes)"


Before we celebrate and have too many tacos, let’s double-check our data

It never hurts to make sure we’re getting the same data back.

I can see that I am getting 10 rows back. Our results said at the beginning had 10 rows, and my rewritten query also has 10 rows. Now that’s a good sign but let’s do a deeper check.

I’m going to use EXCEPT

The first query here, this is our original query with the local variable here, and the only thing I’ve changed about it is I commented out the order by, because what I’m going to do is I’m saying, run this first query and I’m going to highlight EXCEPT and run this as one batch.

Run this first query and give me everything back except what has a direct match in the second query. And the second query here is our rewritten TSQL version of the query.

Show me all of the rows in the top query that except for the rows that have an exact match, same data in the same columns in the second query so if we get zero rows back, everything matches, and everything does match and the fact that we know that there are 10 rows in both data sets is pretty promising as well. That looks good for our TSQL rewrite there.

Quick recap of tuning approaches that we took

There are quick fixes to this solution. Combining recompile with rewriting those joins and simplifying the join. Or just using literals does meet the challenge and get you possibly quickly if you do that real fast to a fast performing query.

The cons to this approach are simply that these tuning tricks of using recompile or using literal don’t scale well in most production environments these days.

There may be times where it’s fine in that environment because you just don’t compile a lot, but in a lot of environment we want to have parameterized, reusable execution plans whenever we can.  There may be exceptions when we can’t. We found more reusable solutions by figuring out how to rewrite out code to push that predicate into the query that ranks.

This did require a longer query that is harder to read, but we go a more reusable plan while getting under that 500 logical read challenge mark.

The tuning process that we use to get there was based around asking, “Where does this query primarily limit or filter the rows?” For many queries limiting down the data set is a critical first step. Now it varies in some queries, we’re working with very large data sets, and those can be different, but in this case we’re only looking at specific years, so focusing in on where we filter in the data and how can we do that in the most efficient way was key for that, and isolating one part of the query and reproducing and solving the problem there set us up to then fix the rest of the query, add it back in stages, rewrite it and add it back in stages, and as we add back in additional joins in the query I also added in their columns and the select at the same time and checked the performance as we went so that I didn’t have to add in a lot of stuff at once and then be shocked and not know where to go if the performance changed.

Thanks so much for joining me for this SQL challenge. I would love to hear comments on the pages in the course about solutions you came up with as well as any questions that you’ve got. And I hope that you’ll join me for future SQLChallenges as well. I’m Kendra from Bye folks.