Everything I Never Wanted to Know About Collation (14 minute video/podcast episode)

I’m not an expert on collation in SQL Server, but there are a few important facts which I’ve had to learn the hard way. In this episode I share the basics, along with helpful resources to learn more.

One small localization correction: the default collation on a SQL Server install will be “the oldest available version that’s associated with each specific locale.” So if your Windows installation has a different localization than mine — I always use English (USA) — then you’ll see a different default collation than the one I discuss in the episode. The main point is still the same: it won’t be the recommended collation for *new* development.

Resources:

Prefer to listen on the go? Get this episode on iTunesplay the audio file, or find Dear SQL DBA in your favorite podcast app.

5 Comments. Leave new

  • Thomas Wetzel
    July 20, 2020 11:20 am

    Hi Kendra,

    “This is the default collation” sounds great.

    But it’s even more complex and misleading.

    The default collation of MS SQL Server depends on the Windows locale that is set in your control panel.

    For “German (Germany)” the default collation is e. g. Latin1_General_CI_AS.

    There is a nice overview here: https://docs.microsoft.com/en-us/previous-versions/sql/sql-server-2008-r2/ms143508(v=sql.105)?redirectedfrom=MSDN

    Just to make the confusion perfect.

    Thomas

    Reply
  • Thanks for the talk!

    One thing raising in my mind is “Why the SQL_ collations are obsolete?”
    I would expect they are equal and compatible independently on SQL Server and Windows versions. Can we say that about Windows collations? They may differ from version to version and we are never sure the sorting will work the same way when OS is upgraded… Am I right?

    OK, most of the cases will work but I would bet we may find differences if we would have time to investigate it in all details.

    Reply
  • Alex Friedman
    July 21, 2020 3:10 pm

    A fun aspect of collation is joining between databases in different collations that have to stay that way since they’re huge and have tons of legacy code on them (FML). The devs get a collation mismatch error, throw a random COLLATE somewhere, and F5 happily. Later you get table scans in prod — as COLLATE is a function on the column, after all — and have to figure out the flow of the data to use COLLATE correctly for SARGability. A neat trick I’ve come across for that is to write join conditions twice, each time using a COLLATE on a different side, to support plans going both ways.
    Definitely do everything in your power to have all the collations perfectly aligned everywhere, always.

    Reply

Leave a Reply to Thomas Wetzel Cancel reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Menu
%d bloggers like this: