Does Truncate Table Reset Statistics?

on December 8, 2016

Short answer: the SQL Server optimizer will know that the table was truncated, but statistics might not update when you expect.

For the long answer, let’s walk through an example using the WideWorldImporters sample database.

I’ll be using Trace Flag 3604 and 2363 to get SQL Server to print information about how it optimized my query out to the messages tab. (Thanks to Paul White for blogging about this trace flag.)

First, a fresh restore of WideWorldImporters

USE master;
GO

IF DB_ID('WideWorldImporters') IS NOT NULL
ALTER DATABASE WideWorldImporters SET OFFLINE WITH ROLLBACK IMMEDIATE

RESTORE DATABASE WideWorldImporters FROM DISK=
    'S:MSSQLBackupWideWorldImporters-Full.bak'
    WITH REPLACE
GO

USE WideWorldImporters;
GO

Before we do anything, what do the statistics look like on Sales.OrderLines?

Here’s the query that I’m using to inspect the statistics:

SELECT 
    sp.last_updated,
    stat.name as stats_name,
    STUFF((SELECT ', ' + cols.name
        FROM sys.stats_columns AS statcols
        JOIN sys.columns AS cols ON
            statcols.column_id=cols.column_id
            AND statcols.object_id=cols.object_id
        WHERE statcols.stats_id = stat.stats_id and
            statcols.object_id=stat.object_id
        ORDER BY statcols.stats_column_id
        FOR XML PATH(''), TYPE
    ).value('.', 'NVARCHAR(MAX)'), 1, 2, '')  as stat_cols,
    sp.modification_counter,
    sp.rows,
    sp.rows_sampled
FROM sys.stats as stat
CROSS APPLY sys.dm_db_stats_properties (stat.object_id, stat.stats_id) AS sp
JOIN sys.objects as so on 
    stat.object_id=so.object_id
JOIN sys.schemas as sc on
    so.schema_id=sc.schema_id
WHERE 
    sc.name= 'Sales'
    and so.name='OrderLines'
ORDER BY 1 DESC
GO

Statistics were last updated on June 2, 2016. We’ll be mostly looking at the statistic on Quantity throughout the example, so I’ve highlighted it:

statistics-before-changes

Let’s run a query that loads the statistic on Quantity

Before we truncate the table, let’s take a peek into how SQL Server optimizes a query that cares about rows in Sales.OrderLines with Quantity > 10. I’m using trace flags 3604 and 2363 to make SQL Server print information about how it used statistics to optimize this to my messages tab.

SELECT *
FROM Sales.OrderLines
WHERE Quantity > 10
    OPTION
(
    QUERYTRACEON 3604,
    QUERYTRACEON 2363,
    RECOMPILE
)
GO

Here’s the info on the messages tab:

Begin selectivity computation

Input tree:

  LogOp_Select

      CStCollBaseTable(ID=1, CARD=231412 TBL: Sales.OrderLines)

      ScaOp_Comp x_cmpGt

          ScaOp_Identifier QCOL: [WideWorldImporters].[Sales].[OrderLines].Quantity

          ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=10)

Plan for computation:

  CSelCalcColumnInInterval

      Column: QCOL: [WideWorldImporters].[Sales].[OrderLines].Quantity

Loaded histogram for column QCOL: [WideWorldImporters].[Sales].[OrderLines].Quantity from stats with id 7

Selectivity: 0.44231

Stats collection generated: 

  CStCollFilter(ID=2, CARD=102356)

      CStCollBaseTable(ID=1, CARD=231412 TBL: Sales.OrderLines)

End selectivity computation

Estimating distinct count in utility function

Input stats collection:

    CStCollFilter(ID=2, CARD=102356)

        CStCollBaseTable(ID=1, CARD=231412 TBL: Sales.OrderLines)

Columns to distinct on:QCOL: [WideWorldImporters].[Sales].[OrderLines].OrderLineID


Plan for computation:

  CDVCPlanUniqueKey

Result of computation: 102356


(102035 row(s) affected)

Highlights: one of the first thing SQL thinks about is the number of rows in the table

Right at the beginning, we see: “CStCollBaseTable(ID=1, CARD=231412 TBL: Sales.OrderLines)”

That ‘CARD’ number is the optimizer thinking about how many rows are in this table. If you glance back up at the table statistics, the most recent statistic to be updated was on the ‘LastEditedWhen’ column. When that statistic was updated, there were 231,412 rows in the table.

SQL Server decides that it wants detail on the Quantity column to figure out how to run this query, so we see that it loads that statistic up to use: “Loaded histogram for column QCOL: [WideWorldImporters].[Sales].[OrderLines].Quantity from stats with id 7”

Let’s truncate this table

I wipe out all the rows with this command:

TRUNCATE TABLE Sales.OrderLines;
GO

Now, I wouldn’t expect truncating the table to automatically update the statistics.

SQL Server updates statistics when they’re used to optimize a query – so if nobody queries this table for six months, I wouldn’t expect the stats to update for six months.

Let’s re-run our query, trace flags and all:

SELECT *
FROM Sales.OrderLines
WHERE Quantity > 10
    OPTION
(
    QUERYTRACEON 3604,
    QUERYTRACEON 2363,
    RECOMPILE
)
GO

The messages tab has less info this time- it’s much more concise!

Begin selectivity computation

Input tree:

  LogOp_Select

      CStCollBaseTable(ID=1, CARD=1 TBL: Sales.OrderLines)

      ScaOp_Comp x_cmpGt

          ScaOp_Identifier QCOL: [WideWorldImporters].[Sales].[OrderLines].Quantity

          ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=10)

Plan for computation:

  CSelCalcFixedFilter (0.3)

Selectivity: 0.3

Stats collection generated: 

  CStCollFilter(ID=2, CARD=1)

      CStCollBaseTable(ID=1, CARD=1 TBL: Sales.OrderLines)

End selectivity computation


(0 row(s) affected)

SQL Server knows that we blew away all those rows

This time we see “CARD=1 TBL: Sales.OrderLines”

SQL Server doesn’t like to estimate 0 for empty tables. It likes to estimate 1. It knows this table is empty.

With this information, it chooses a different plan for computation. The plan doesn’t require looking at the quantity column this time– we don’t have any lines about that at all.

But the statistics don’t look any different

You might expect to see that the statistic on Quantity had updated. I expected it, before I ran through this demo.

But SQL Server never actually had to load up the statistic on Quantity for the query above. So it didn’t bother to update the statistic. It didn’t need to, because it knows that the table is empty, and this doesn’t show up in our column or index specific statistics.

To verify, I just rerun my metadata query above, and things look the same:

statistics-after-truncate-and-query

What if the table has exactly one row?

Let’s insert one and find out:

INSERT INTO [Sales].[OrderLines] (OrderLineID, OrderID, StockItemID, Description, PackageTypeID, Quantity, UnitPrice, TaxRate, PickedQuantity, PickingCompletedWhen, LastEditedBy, LastEditedWhen)
     VALUES (1, 45, 164, '32 mm Double sided bubble wrap 50m', 7, 50, 112.00, 15.000, 50, '2013-01-02 11:00:00.0000000', 4, '2013-01-02 11:00:00.0000000')
GO

Now we run our familiar query, with all its merry trace flags:

SELECT *
FROM Sales.OrderLines
WHERE Quantity > 10
    OPTION
(
    QUERYTRACEON 3604,
    QUERYTRACEON 2363,
    RECOMPILE
)
GO

And here’s what SQL Server has to say about optimizing that…

Begin selectivity computation

Begin selectivity computation

Input tree:

  LogOp_Select

      CStCollBaseTable(ID=1, CARD=1 TBL: Sales.OrderLines)

      ScaOp_Comp x_cmpGt

          ScaOp_Identifier QCOL: [WideWorldImporters].[Sales].[OrderLines].Quantity

          ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=10)

Plan for computation:

  CSelCalcColumnInInterval

      Column: QCOL: [WideWorldImporters].[Sales].[OrderLines].Quantity

Loaded histogram for column QCOL: [WideWorldImporters].[Sales].[OrderLines].Quantity from stats with id 7

Selectivity: 1

Stats collection generated: 

  CStCollFilter(ID=2, CARD=1)

      CStCollBaseTable(ID=1, CARD=1 TBL: Sales.OrderLines)

End selectivity computation


(1 row(s) affected)

One row is enough to use our column statistic

Looking at the beginning, CARD=1 for Sales.OrderLines, just like it did after we truncated the table. But SQL Server does something different this time, indicating that it now knows that the table isn’t really empty.

It goes back to the CSelCalcColumnInInterval plan to optimize. And it loads up the column stat for the Quantity column.

Since this statistic was loaded into memory, it should have auto-updated based on my database settings. Sure enough, it did:

statistics-after-truncate-and-query-and-insert

SQL Server knows when you’ve truncated a table

And the fact that the table has been truncated may mean that it doesn’t need to use statistics on the table when optimizing queries. After all, it’s an empty table, so it can take shortcuts!

So don’t get too confused if statistics look way out of date for a truncated table. Instead, ask yourself, “why am I querying a truncated table?” (Related disclaimer: I only tested this on SQL Server 2016.)

Want to learn more about statistics in SQL Server? Start here.