Find the Scalar Function Call: What It Means If It's Hiding in a Probe Residual

on July 2, 2018

User-defined functions are fairly simple to create in SQL Server, but figuring out exactly how they are being used can take a little time, even in a simple execution plan. This is because function calls can be tucked away in many different operators in the plan, including join operators.

The good news: once you learn what to look for, it’s not terribly complicated. There are some fancy terms involved, but the definitions are pretty simple.

Here’s a quick look at a function call hidden in a hash match operator, which is being used to implement an inner join.

The estimated query execution plan

This query uses the Microsoft WideWorldImporters sample database. It calls Website.CalculateCustomerPrice, a scalar user-defined function that ships with the database:

SELECT
    il.InvoiceLineID
FROM Sales.Invoices AS i
JOIN Sales.InvoiceLines AS il
    ON i.InvoiceID=il.InvoiceID
WHERE
    [Website].[CalculateCustomerPrice](i.CustomerID, il.StockItemID, i.InvoiceDate)
        <>
    il.UnitPrice;
GO

Here’s the query execution plan:

There aren’t a lot of places for the function call to hide in this particular plan, right? There are no Compute Scalar operators to look into at all.

If we look at the properties of the Hash Match join, we can find the function call:

How does the hash join work?

The description at the top of this tool tip is a helpful reminder of how hash matches work: SQL Server builds a hash table in memory from the Sales.Invoices table. It then “probes” into that table with rows from the Sales.InvoiceLines table.

The “Hash Keys Probe” part of the tool tip tells us that it is looking for matches based on the InvoiceID column (our join column).

What about the ‘Probe Residual’?

After we find matches based on InvoiceID, we have more work “left over”: that’s the “residual” bit.

For every row that matches, SQL Server plugs values into the Website.CalculateCustomerPrice() function and compares the result to the UnitPrice column, just like we asked for in the WHERE clause.

In other words, this is happening for every row in Sales.InvoiceLines that has a matching row in Sales.Invoices.

Which is every single invoice & invoice line, as it turns out.
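To make the build / probe / residual sequence concrete, here is a small conceptual model in Python. It is only a sketch of the idea, not how SQL Server implements a hash match internally. The function body, the sample rows, and the call counter are all made up for illustration; the real pricing logic in Website.CalculateCustomerPrice is not reproduced here.

```python
from collections import defaultdict

function_calls = 0  # counts how often the stand-in function runs


def calculate_customer_price(customer_id, stock_item_id, invoice_date):
    """Hypothetical stand-in for Website.CalculateCustomerPrice."""
    global function_calls
    function_calls += 1
    return 10.0 * stock_item_id  # invented pricing rule, for illustration only


def hash_join_with_residual(invoices, invoice_lines):
    # Build phase: hash the Sales.Invoices rows on the join column.
    hash_table = defaultdict(list)
    for inv in invoices:
        hash_table[inv["InvoiceID"]].append(inv)

    results = []
    # Probe phase: look up each Sales.InvoiceLines row by InvoiceID.
    for line in invoice_lines:
        for inv in hash_table.get(line["InvoiceID"], []):
            # Probe residual: the extra predicate, evaluated once per
            # row that already matched on the hash key.
            price = calculate_customer_price(
                inv["CustomerID"], line["StockItemID"], inv["InvoiceDate"]
            )
            if price != line["UnitPrice"]:
                results.append(line["InvoiceLineID"])
    return results


# Made-up sample rows: every invoice line has a matching invoice.
invoices = [
    {"InvoiceID": 1, "CustomerID": 100, "InvoiceDate": "2016-01-01"},
    {"InvoiceID": 2, "CustomerID": 200, "InvoiceDate": "2016-01-02"},
]
invoice_lines = [
    {"InvoiceLineID": 11, "InvoiceID": 1, "StockItemID": 1, "UnitPrice": 10.0},
    {"InvoiceLineID": 12, "InvoiceID": 1, "StockItemID": 2, "UnitPrice": 25.0},
    {"InvoiceLineID": 13, "InvoiceID": 2, "StockItemID": 3, "UnitPrice": 30.0},
]

mismatched = hash_join_with_residual(invoices, invoice_lines)
print(mismatched)      # [12]
print(function_calls)  # 3
```

In this toy data set only one line fails the price comparison, but the function still runs three times: once for every invoice line that found its invoice. That is the same pattern the plan shows, just at a much smaller scale.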

Bonus

If you’re ever looking for an example of a query where collecting an actual execution plan dramatically slows down execution (the observer effect), this is a good one. On instances I’ve tested, this query takes 14-18 seconds to run when no plan is collected, and about a minute longer when you enable actual plan collection.

Takeaways

When you’re looking at execution plans, it can be very tricky to spot user-defined functions:

  • Don’t stop at Compute Scalar operators (if the plan has any); also look carefully at the properties of other operators, including joins
  • If you spot the function call as part of a “probe residual” on a hash join, it is being applied to every row that meets the initial join criteria