Collation considerations to create robust queries

Is your code "This" robust?
What is Collation? In short words: are the sorting rules, case, and accent sensitivity properties for your data. (if you want to read more, click here).


If you are dedicated to database development or you provide support to a lot of user databases, then you must be aware of the collation of the database at the moment to write your queries to avoid execution failures at deploy stage.


On this post, I will show you some of the most common errors related to collation and how to avoid them to have solid T-SQL queries.


1. Your DEV machine (all defaults)


If you just install SQL Server in your dev machine and use all the defaults, including collation, it is very likely you won't have any issues of this kind, so let us take the following TSQL as an example:

USE School;

--Current database and master 
SELECT name, collation_name FROM sys.databases
where database_id = DB_ID() or database_id =1; 


SELECT DEP.* 
FROM Department dep

It will execute without any issues:


So now, let us change the Collation for the School database:


USE [master];

ALTER DATABASE [School] COLLATE SQL_Latin1_General_CP1_CS_AS;


Collations are different now

After this change, the fun begins...

2. The column prefix '<something>' does not match with a table name or alias name used in the query


Executing the same tsql code, now with a different collation will raise an error:



To solve this, verify that the case on the alias, object and properties are the same, and change it to be the same across all your query:




3. The multi-part identifier "<something>" could not be bound


Our second error, if we add a WHERE clause to the query, again with a different case, an error will show:



To solve this, as the previous error, verify that the case on the alias, object and properties are the same, and change it to be the same across all your query:



4. Cannot resolve the collation conflict between "<Collation 1>" and "<Collation 2>" in the equal to operation


For our third error, we will use a different query that use tables for different databases (and maybe different collations):


SELECT QUOTENAME(SU.[name]) AS UserName 
          ,QUOTENAME(SP.[name]) AS PrincipalName 
FROM sys.sysusers AS SU -- Table located on the database
LEFT JOIN sys.server_principals AS SP -- Table located on [master]
ON SU.[name]  = SP.[name] 
AND SP.[type] = 'S';

Even when we have applied the case rules we explained before, we have another kind of error when databases use different collation:



To solve this, you must apply the same collation to both sides of the join, using COLLATE DATABASE_DEFAULT as you can see here:



One important thing, you only have to do this change where you use multi-database joins, if your joins are in the same database, this is not necessary.


Now you have the knowledge to build more robust T-SQL queries that can be deployed to a wide set of different environments and configurations and you have more probabilities they will not fail.

Comments

Popular posts from this blog

Checking Azure SQL DB service tier via T-SQL

Install Python on Windows

Quick tip: Zoom in Azure Data Studio