How to get a case sensitive version of a collation in SQL Server?

后端未结

关注

 1  1987

孤城傲影 2021-01-21 08:32

Is there a way to get a case-sensitive version of a collation to use in a query?

Let\'s say that the query could be used on databases with different collations, some whi

1条回答

抹茶落季 (楼主)

2021-01-21 09:13

Collations are not necessarily determined by the Database default value: they can be set per string field as well.

No, I have never seen a way (and I have looked) to do dynamic collations outside of using Dynamic SQL to write the COLLATE clause into a query. Or, if the number of options you need to account for are fairly minimal, you could maybe try something like the following:

SELECT ...
FROM   ...
WHERE (@CaseSensitive = 1 AND [Field] LIKE N'%' + @Name + N'%' COLLATE Something_CS_AS)
OR (@CaseSensitive = 0 AND [Field] LIKE N'%' + @Name + N'%')

Also, there is no direct equivalence between Case (or even Accent, Kana, or Width) sensitive and insensitive. While most of the time there is a case-sensitive counterpart to a case-insensitive collation, there are 15 collations that are case-insensitive-only:

;WITH CaseS AS
(
  SELECT [name]
  FROM   sys.fn_helpcollations()
  WHERE  [name] LIKE N'%[_]cs[_]%'
)
SELECT CaseI.*
FROM   sys.fn_helpcollations() CaseI
LEFT JOIN CaseS
       ON CaseI.name = REPLACE(CaseS.[name], N'_CS_', N'_CI_')
WHERE  CaseI.[name] LIKE N'%[_]ci[_]%'
AND    CaseS.[name] IS NULL;

Returns:

name                                  description
SQL_1xCompat_CP850_CI_AS              ...
SQL_AltDiction_CP850_CI_AI            ...
SQL_AltDiction_Pref_CP850_CI_AS       ...
SQL_Danish_Pref_CP1_CI_AS             ...
SQL_Icelandic_Pref_CP1_CI_AS          ...
SQL_Latin1_General_CP1_CI_AI          ...
SQL_Latin1_General_CP1253_CI_AI       ...
SQL_Latin1_General_CP437_CI_AI        ...
SQL_Latin1_General_CP850_CI_AI        ...
SQL_Latin1_General_Pref_CP1_CI_AS     ...
SQL_Latin1_General_Pref_CP437_CI_AS   ...
SQL_Latin1_General_Pref_CP850_CI_AS   ...
SQL_Scandinavian_Pref_CP850_CI_AS     ...
SQL_SwedishPhone_Pref_CP1_CI_AS       ...
SQL_SwedishStd_Pref_CP1_CI_AS         ...

Fixing a collation like this in the query could cause problems when migrating the code,

Why? Where are you planning on migrating the code to? If to another RDBMS, then you already need to contend with datatype differences, SQL dialect differences, "best practices" differences, etc. So why worry about collations? Unless you know for certain that you will be migrating to another RDBMS, you should make your system work as best as it can by using your current platform to the best of its abilities, rather than existing in a less-than-optimal state due to only using lowest-comment-denominator functionality.

or changing the DB collation at a latter date.

Why would you do this? Again, any string fields with an explicit COLLATION setting are not affected by the database default.

If you are looking for strict Case (and everything including Accent, etc) sensitivity on equivalence (we are not talking about range searches or sorting), then you can use a Binary collation (i.e. one ending in either _BIN or _BIN2). Just keep in mind that binary collations might not sort the way you might expect since they are not "dictionary" based sorts, at least not in terms of a single binary collation that would behave the same across all languages. They also don't make equivalences between languages (i.e. equating "a" with an "a" that has an accent).

Since the original posting of this answer I have discovered that the paragraph above is actually bad advice. Please do not use a binary collation if the goal is case-sensitivity. It is too strict and in many cases will not give accurate results. For details and examples, please see: No, Binary Collations are not Case-Sensitive.

Also, please do not use binary collations ending in just _BIN as they have been obsolete since SQL Server 2005 was released and should only be used when needing to maintain backwards compatibility with another system that is also using a _BIN collation. If you need a binary collation, use one ending in _BIN2. For details and examples, please see: Differences Between the Various Binary Collations (Cultures, Versions, and BIN vs BIN2).

UPDATE

I was able to come up with a function to get the case sensitive version, if one exists, of the passed-in collation. This function, however, will only assist in creating the correct Dynamic SQL; it cannot be used inline in a query to set the COLLATE clause dynamically (mainly because that cannot be done). There are two parameters:

@CollationName -- if you pass this in, you will get back the case-sensitive version of it, if one exists. The @DatabaseName param will be ignored.
@DatabaseName -- if you don't know the exact collation, leave @CollationName as NULL and pass this in and it will look up the default collation for that database.
If both params are NULL then it will look up the default collation for the database that the function exists in.
If the passed-in or looked-up collation is already case-sensitive, that name will be returned
TO DO (when I have time): look up server default collation for databases that do not have a default (they will have NULL as their default collation name)

There are two versions of the function: the first is a TVF (as those are faster) and a Scalar UDF (as those are sometimes easier to interact with).

Table-Valued Function:

USE [Test];
SET ANSI_NULLS ON;

IF (OBJECT_ID(N'dbo.GetCaseSensitiveCollation') IS NOT NULL)
BEGIN
  DROP FUNCTION dbo.GetCaseSensitiveCollation;
END;

GO
CREATE FUNCTION dbo.GetCaseSensitiveCollation
(
  @CollationName sysname,
  @DatabaseName sysname
)
RETURNS TABLE
--WITH SCHEMABINDING
--     Cannot schema bind table valued function 'dbo.GetCaseSensitiveCollation'
--     because it references system object 'sys.fn_helpcollations'.
AS RETURN

  WITH collation(name) AS
  (
    SELECT CONVERT(sysname, COALESCE(@CollationName,
                DATABASEPROPERTYEX(COALESCE(@DatabaseName, DB_NAME()), 'Collation')))
  )
  SELECT col.[name]
  FROM   sys.fn_helpcollations() col
  CROSS JOIN collation
  WHERE  col.[name] = CASE WHEN collation.[name] LIKE N'%[_]CS[_]%' 
                               THEN collation.[name]
                           ELSE REPLACE(collation.[name], N'_CI_', N'_CS_')
                      END;
GO

Examples:

-- Get CS Collation for the specified Collation
SELECT [name] AS [BySpecificCollation]
FROM dbo.GetCaseSensitiveCollation(N'Indic_General_100_CI_AS_KS_WS', NULL);

-- Get CS Collation based on database default for the specified database
SELECT [name] AS [ByDefaultCollationForDB]
FROM dbo.GetCaseSensitiveCollation(NULL, N'msdb');

-- Get CS Collation based on database default for database that the function exists in
SELECT [name] AS [CurrentDB]
FROM Test.dbo.GetCaseSensitiveCollation(NULL, NULL);

-- Get CS Collation based on database default for the current database
USE [ReportServer];
SELECT [name] AS [CurrentDB]
FROM Test.dbo.GetCaseSensitiveCollation(NULL, DB_NAME());

Scalar User-Defined Function:

USE [Test];
SET ANSI_NULLS ON;

IF (OBJECT_ID(N'dbo.GetCaseSensitiveCollation2') IS NOT NULL)
BEGIN
  DROP FUNCTION dbo.GetCaseSensitiveCollation2;
END;
GO
CREATE FUNCTION dbo.GetCaseSensitiveCollation2
(
  @CollationName sysname,
  @DatabaseName sysname
)
RETURNS sysname
--WITH SCHEMABINDING
--     Cannot schema bind table valued function 'dbo.GetCaseSensitiveCollation2'
--     because it references system object 'sys.fn_helpcollations'.
AS
BEGIN
  DECLARE @NewCollationName sysname;

  ;WITH collation(name) AS
  (
    SELECT CONVERT(sysname, COALESCE(@CollationName,
                DATABASEPROPERTYEX(COALESCE(@DatabaseName, DB_NAME()), 'Collation')))
  )
  SELECT @NewCollationName = col.[name]
  FROM   sys.fn_helpcollations() col
  CROSS JOIN collation
  WHERE  col.[name] = CASE WHEN collation.[name] LIKE N'%[_]CS[_]%'
                                THEN collation.[name]
                           ELSE REPLACE(collation.[name], N'_CI_', N'_CS_')
                      END;

  RETURN @NewCollationName;
END;
GO

Examples:

/* Get CS Collation for the specified Collation */
SELECT dbo.GetCaseSensitiveCollation2(N'Indic_General_100_CI_AS_KS_WS', NULL)
                 AS [BySpecificCollation];
-- Indic_General_100_CS_AS_KS_WS

/* Get CS Collation based on database default for the specified database */
SELECT dbo.GetCaseSensitiveCollation2(NULL, N'msdb') AS [ByDefaultCollationForDB];
-- SQL_Latin1_General_CP1_CS_AS

/* Get CS Collation based on database default for the current database */
USE [ReportServer];
SELECT Test.dbo.GetCaseSensitiveCollation2(NULL, DB_NAME()) AS [CurrentDB];
-- Latin1_General_CS_AS_KS_WS

/* Get CS Collation based on database default for database where the function exists */
SELECT Test.dbo.GetCaseSensitiveCollation2(NULL, NULL) AS [DBthatFunctionExistsIn];
-- SQL_Latin1_General_CP1_CS_AS

0 讨论(0)