Which is Worse: an Unused Index, or an Un-Indexed Foreign Key? (Dear SQL DBA Episode 16)

Should you always index your foreign keys? What if you index them, and then the index never gets used? Watch the 17 minute video, listen to the podcast, or read the episode notes and links below.


Dear SQL DBA,

We recently had a SQL Server performance assessment. It remarked on two things that made me think:

1 ) tables with foreign keys but missing supporting index
2 ) tables with indexes that are not used

But in our case the remark in Case 2 was on a index that supports a foreign key!

Curtain down!

Now to my question, in which cases should you as a rule create indexes to support a foreign key?

Regards,
The Indexer Who Can’t Win

LOL. I love this one!

Clearly, the solution is to write some meaningless queries that use every unused foreign key indexes, right? 😀

Quick background – what’s a foreign key?

foreign-key-parent-child-kendra-little

Foreign keys are a great way to make sure you have valid data in a relational database.

I think of this in terms of parents and children. For example, let’s say I have two tables: dbo.FirstName, and dbo.Students.

The dbo.FirstName table is the parent table. It has two columns, FirstNameId, and FirstName. This parent contains the valid, unique list of FirstNames. The FirstNameId column is defined as a Primary Key column.

The dbo.Students table is the child table. It has many columns. FirstNameId is one of the columns.

I can create a Foreign Key constraint on the child dbo.Students.FirstNameId that references the parent, dbo.FirstNames.FirstNameId.

This means that whenever I insert a row into dbo.Students, SQL Server will check and make sure that the FirstNameId I use is a valid row in the parent table, dbo.FirstNames.

You can get creative with foreign keys:

It’s a good practice to index your foreign keys

To be clear, you don’t have a choice about indexing on the ‘parent’ side of the foreign key. The referenced parent key column(s) has to have an index. (Primary keys and unique constraints are all backed by an index.)

You have a choice about whether to index the columns in the child table – in my example, on dbo.Students.FirstNameId.

I generally do index the child keys, because it’s very common for queries to join between the parent and child columns of a foreign key So there’s a good likelihood you’re going to need an index there.

Erin Stellato wrote a great post on SQLPerformance.com with example code showing why indexing a foreign key can make a big performance difference. It’s called “The Benefits of Indexing Foreign Keys“.

There will be some times when you want to make an exception. (More on this soon.)

What about those ‘unused index’ warnings?

lesser-weevil

Whomever ran your performance assessment didn’t have time to look at this very deeply — or if they did, they didn’t explain it in detail. I can’t really criticize them too much, I’ve had egg on my own face before.

There’s a couple of possibilities about why indexes on your foreign keys might not be used:

  • No queries are being run that join between the parent and child table.
  • Queries are being run, but they don’t like the index because it doesn’t ‘cover’ the query — so they’re doing something else. They could be scanning the base table or using a near-duplicate index.
    • If there is another index that leads on the same keys that is pulling all the weight and covers the foreign key better, the narrow index keyed only on the foreign key columns isn’t adding value.

I would look for the “lesser evil”.

To check if an unused index is really a problem, ask:

  • Is the unused index so large that it takes up significant room in the data file / backups / restores?
  • Does the table have such a high rate of modifications that the index may be a performance problem?

If either of these are a big concern, I’d consider dropping the index on the foreign keys.

However, I would want to monitor it for some time to make sure that there isn’t a workload that really cares about that index which just hasn’t run in a while.

When might you want to not use a simple index on foreign key columns?

There’s always an exception, right?

Here are some examples I can think of when I might not want a simple index on a child key:

  • The foreign key index might help performance significantly if it’s widened to “cover” critical queries. In this case, adding additional key and/or included columns may improve query performance (Fairly common)
  • For tables with very high modifications rates which are performance sensitive, when I know for sure I don’t need to index the foreign keys, they just add overhead and don’t help. You need to be sure you don’t need the foreign key index, or you could run into a major blocking problem because they’re missing, though. (Edge case)
  • When the data in the foreign key columns is very non-selective and it’s a super narrow table, indexing the foreign key may not make a huge performance difference. (Edge case)

I think SQL Server should index foreign keys automatically

I actually think it would be a nice feature for SQL Server to index foreign key columns automatically, as long as it let you specify not to do so, and also let you alter/drop the indexes later.

There’s a fun discussion of this on Greg Low’s blog from 2008 in his post, “Indexing Foreign Keys – Should SQL Server Do that Automatically?”

I feel like this is a “nice to have” feature, but I don’t feel strongly about the SQL Server product team spending time on it. They’ve got other features that I’d like to have more, because in this case I think other features help compensate a bit.

SQL Server’s “missing index” feature isn’t perfect, but it’s pretty darn good at complaining. If you forget to index a foreign key that’s critical to queries or if the index gets dropped, SQL Server is pretty likely to start asking for an index.

So if you’re looking at your missing index warnings, you’re likely to figure out that an index on the foreign key column can speed up your queries (maybe without even realizing it’s related to a foreign key).

I do recommend handling the missing index feature carefully, like it’s a small child who really wants candy in the grocery store checkout line. But that’s a story for another day.

Summing it all up

So, which is Worse: an Unused Index, or an Un-Indexed Foreign Key? Usually an un-indexed foreign key is worse.

Previous Post
Index Tuning Decision Tree for SQL Server
Next Post
Measuring Query Duration: SSMS vs SQL Sentry Plan Explorer

Related Posts

13 Comments. Leave new

  • Hi Kendra,

    great podcast!
    However your strong arguments of the non-issue of unused indexes made me thinking a lot.
    Do you refer to unused foreign key indexes only?

    I am doing performance tuning for three months now (took your great video training at learning.brentozar.com and loved your call for “minimalism” ). I am facing major SAN issues so IO is the real bottleneck. Unfortunately there’s no quick fix. Even as there are much more reads than writes I always try to look at the cost (index update and (offline :-/) maintenance) of an index as well and drop indexes which are much more written to than queried by regularly if I do some performance tuning regarding the table. Keeping these indexes doesn’t feel good because I then would simply be adding stuff (helping queries today) instead of cleaning up the mess created by developers with no indexing knowledge in the past and building a better solution.

    I must admit that I don’t really believe in such a general best practice as to “create a fk index on every table” but I will give it a try and look at the pros and cons on a test system.

    Can you understand my dilemma…what do you think?…maybe I misunderstood the message you tried to tell in your podcast..

    Reply
    • I think I probably have chilled out on unused indexes over the years.

      There’s two things you have to weigh against one another:
      A) The value in reclaiming wasted space and eliminating the overhead of those indexes (which can be considerable for index maintenance, checkdb, and backups)
      B) The risk of dropping an index that is hinted in some code or useful for an important but rare process – and the time it’ll take you to clean up that mess if it’s the case.

      When you’re looking at it at the instance level, if you’ve got 100 x 1 GB indexes that are unused, the cumulative value is pretty darn noticable. I don’t want to discount that. It’s probably worth tracking them and seeing if it’s worth the risk of dropping them.

      But if you’ve got 100 x 1 MB indexes, and those tables aren’t super active, the cumulative value of dropping them is much more negligible, and it’s probably not worth the risk.

      Does that make sense?

      Reply
  • […] Kendra Little answers a question about indexing foreign key constraints: […]

    Reply
  • The illustration only shows one side of the foreign key relationship. If the parent key column is updated or the parent row is deleted, the database has to check the child table for references to the deleted or changed key. So even if there are no queries that would use the foreign key index on the child table, the index could still be important.

    Reply
  • Mike Bilhartz
    December 7, 2016 6:48 am

    I don’t agree with the default practice of indexing FKs. Take, for example, a ROLAPish fact table in a DW with several dozen dimensions: each one of these dimensions is an FK to some kind of dimension/reference/lookup table. Let’s assume for the sake of argument that the business requirement for these FKs has been validated, and that for the purpose intended it is not practical to jam a bunch of them into some kind of junk table.

    Furthermore, assume the reference/dimension values are pretty static: for most of them, the change rate approaches never. Many of them have only a half dozen or so different values. Deletes practically never occur- the business logic doesn’t allow deletes of parents.

    Are you kidding me that the fact table should have several dozen FK indexes on it? Actually, I don’t think you are saying that, but unfortunately I think what you said will be interpreted that way. Please don’t give pat answers, because it suppresses analytical thinking. Here are some things to consider: If there are only very rare changes to the dimension table, the FK constraint will seldom be enforced from parent to child (constraint enforcement from child to parent doesn’t need the index). Parent/Child joins based on some value in the parent- for tables with a handful of records, using a join is extremely inefficient because of the I/O penalty for lookups back to the clustered index. Or IU impact: updating several dozen indexes for every single IU child record can have a crushing impact on performance, and I mean crushing. What if the most efficient clustered index for most purposes is not an int ID? The nonclustered pointers to the clustered data can be pretty wide. Multiply those large keys times several dozen.

    I think it is vital for our gurus to teach analytical thinking before “good practice” (especially good practice that is interpreted as best practice).

    Reply
    • “Please don’t give pat answers, because it suppresses analytical thinking.”

      I would also say, “Don’t throw out generalizations, because life is complicated enough.” Best practices *are* useful, and when we have two conflicting best practices, it’s fair to say which is generally more important. All the details you needed to use to show an exception rather prove the point that it’s an exception and not the rule — but I very much appreciate the time you took to write it up! It would make a terrific blog post.

      Reply
      • Mike Bilhartz
        December 8, 2016 5:30 am

        Thank you, Kendra. One point, though- all those details are quite commonly found together on fact tables in a DW, and might actually be exacerbated by SCD2 ETL processing of more dynamic dimensions. It’s just not particularly rare in OLAP.

        Reply
        • I find in data warehouses it is fairly common for queries to join between parent and child — because a lot of the queries are written by people who aren’t necessarily performance tuners. That may just be the type of DWs that I encounter, though.

          Reply
  • Mike Bilhartz
    December 8, 2016 7:04 am

    True, but the probability (given a few dozen dimensions) of an index on an FK being used my be surprisingly low- the cost of a lookup to the CI data is pretty significant and the execution plan will often just bypass it for a CI scan. It *depends* lol.

    Reply
  • Regardin DWH fact tables and indexing….I think this is what columnstore indexes were made for. However I didn’t feel the urge to try them out yet. If your fact table just serves an SSAS cube and is queried once per night, indexing might not be that important. But as you both said “it depends”. As an accidental dba and beginner level performance tuner this article is really helpful to me.

    Reply
    • Mike Bilhartz
      December 9, 2016 6:49 am

      “If your fact table just serves an SSAS cube and is queried once per night, indexing might not be that important. ”

      No, it would still be important, I think it can be just as important to not have a useless index as it is to have a needed one, at least there is when the number of indexes becomes more than a handfull. There is a big cost associated with useless indexes in terms of CPU, storage, execution speed, etc., and it is pervasive. I have certainly seen overall system performance improve more often from removing indexes than from adding them!

      Just to repeat the single most important principle about nonclustered indexes (that are noncovering): lookups to the clustered index are extremely inefficient, and a nonclustered FK index will probably not be used if the query is expected to return more than ~10-20% of the records in the child table that are randomly distributed on the clustered index.

      (I agree about columnstore indexes, BTW.)

      Reply
  • […] listening to a podcast from Kenda Little (Video to podcast) about Indexing Foreign Keys it got me thinking, how many people put this into practice in their […]

    Reply

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Menu