Study your Data: Collation considerations to create robust queries

Monday, June 11, 2018

Collation considerations to create robust queries

Is your code "This" robust?

What is Collation? In short words: are the sorting rules, case, and accent sensitivity properties for your data. (if you want to read more, click here).

If you are dedicated to database development or you provide support to a lot of user databases, then you must be aware of the collation of the database at the moment to write your queries to avoid execution failures at deploy stage.

On this post, I will show you some of the most common errors related to collation and how to avoid them to have solid T-SQL queries.

1. Your DEV machine (all defaults)

If you just install SQL Server in your dev machine and use all the defaults, including collation, it is very likely you won't have any issues of this kind, so let us take the following TSQL as an example:

USE School;

--Current database and master

SELECT name, collation_name FROM sys.databases
where database_id = DB_ID() or database_id =1; 


SELECT DEP.* 
FROM Department dep

It will execute without any issues:

So now, let us change the Collation for the School database:

USE [master];

ALTER DATABASE [School] COLLATE SQL_Latin1_General_CP1_CS_AS;

Collations are different now

After this change, the fun begins...

2. The column prefix '<something>' does not match with a table name or alias name used in the query

Executing the same tsql code, now with a different collation will raise an error:

To solve this, verify that the case on the alias, object and properties are the same, and change it to be the same across all your query:

3. The multi-part identifier "<something>" could not be bound

Our second error, if we add a WHERE clause to the query, again with a different case, an error will show:

To solve this, as the previous error, verify that the case on the alias, object and properties are the same, and change it to be the same across all your query:

4. Cannot resolve the collation conflict between "<Collation 1>" and "<Collation 2>" in the equal to operation

For our third error, we will use a different query that use tables for different databases (and maybe different collations):

SELECT QUOTENAME(SU.[name]) AS UserName 
          ,QUOTENAME(SP.[name]) AS PrincipalName 
FROM sys.sysusers AS SU -- Table located on the database
LEFT JOIN sys.server_principals AS SP -- Table located on [master]
ON SU.[name]  = SP.[name] 
AND SP.[type] = 'S';

Even when we have applied the case rules we explained before, we have another kind of error when databases use different collation:

To solve this, you must apply the same collation to both sides of the join, using COLLATE DATABASE_DEFAULT as you can see here:

One important thing, you only have to do this change where you use multi-database joins, if your joins are in the same database, this is not necessary.

Now you have the knowledge to build more robust T-SQL queries that can be deployed to a wide set of different environments and configurations and you have more probabilities they will not fail.

Study your Data

Monday, June 11, 2018

Collation considerations to create robust queries

1. Your DEV machine (all defaults)

2. The column prefix '<something>' does not match with a table name or alias name used in the query

3. The multi-part identifier "<something>" could not be bound

4. Cannot resolve the collation conflict between "<Collation 1>" and "<Collation 2>" in the equal to operation

No comments:

Post a Comment

Match the Model to the Task, Not the App: An Orchestrator-Worker Pattern in Python