What's a good way to trim all whitespace characters from a string in T-SQL without UDF and without CLR?

抹茶落季 2020-11-27 23:41

The .NET method string.Trim trims a rather extensive set of whitespace characters. What is the best way to emulate this exact behavior in T-SQL?

2 Answers
  • 2020-11-27 23:45

    This code shows a pattern for a modified LTRIM that you can extend to whatever set of whitespace characters you choose.

    declare @Tab as NVarChar(1) = NChar( 9 );
    declare @Space as NVarChar(1) = NChar( 32 );
    
    declare @Samples as Table ( String NVarChar(16) );
    insert into @Samples ( String ) values
      ( 'Foo' ),
      ( @Tab + 'Foo' ),
      ( @Space + 'Foo' ),
      ( @Space + @Tab + 'Foo' ),
      ( @Tab + @Space + 'Foo' );
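    -- PatIndex returns the position of the first non-whitespace character,
    -- so subtracting 1 gives the number of leading whitespace characters
    -- (or -1 when the string contains nothing but whitespace).
    select String, Len( String ) as [Length], PatIndex( '%[^' + @Tab + @Space + ']%', String ) - 1 as [WhitespaceCount]
      from @Samples;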
    

    The REVERSE function can be used to implement a modified version of RTRIM.
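
    As a rough sketch of that idea, reusing the tab-and-space pattern from above (the sample values are made up for illustration), the number of trailing whitespace characters is the position of the first non-whitespace character in the reversed string, minus one:

    declare @Tab as NVarChar(1) = NChar( 9 );
    declare @Space as NVarChar(1) = NChar( 32 );

    declare @Samples as Table ( String NVarChar(16) );
    insert into @Samples ( String ) values
      ( N'Foo' ),
      ( N'Foo' + @Tab ),
      ( N'Foo' + @Space + @Tab );
    -- Reversing the string turns trailing whitespace into leading whitespace,
    -- so the same PatIndex pattern can be reused unchanged.
    select String,
      PatIndex( '%[^' + @Tab + @Space + ']%', Reverse( String ) ) - 1 as [TrailingWhitespaceCount]
      from @Samples;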

    NEWER UPDATE: The following code uses the list of whitespace characters that .NET Framework 4 uses. It also works around the fact that LEN does not count trailing blanks.

    declare @Tab as NVarChar(1) = NChar( 9 );
    declare @Space as NVarChar(1) = NChar( 32 );
    
    declare @Samples as Table ( String NVarChar(16) );
    insert into @Samples ( String ) values
      ( 'Foo' ),
      ( @Tab + 'Foo' ),
      ( @Space + 'Foo' ),
      ( @Space + @Tab + 'Foo' ),
      ( @Tab + @Space + 'Foo' ),
      ( @Tab + 'Foo' + @Space ),
      ( @Space + 'Foo' + @Tab ),
      ( @Space + @Tab + 'Foo' + @Tab + @Space ),
      ( @Tab + @Space + 'Foo' + @Space + @Tab ),
      ( 'Foo' + @Tab ),
      ( NULL ),
      ( '           ' ),
      ( @Space + NULL + @Tab + @Tab ),
      ( '' ),
      ( 'Hello world!' );
    
    declare @WhitespacePattern as NVarChar(100) = N'%[^' +
      NChar( 0x0020 ) + NChar( 0x00A0 ) + NChar( 0x1680 ) + NChar( 0x2000 ) +
      NChar( 0x2001 ) + NChar( 0x2002 ) + NChar( 0x2003 ) + NChar( 0x2004 ) +
      NChar( 0x2005 ) + NChar( 0x2006 ) + NChar( 0x2007 ) + NChar( 0x2008 ) +
      NChar( 0x2009 ) + NChar( 0x200A ) + NChar( 0x202F ) + NChar( 0x205F ) +
      NChar( 0x3000 ) + NChar( 0x2028 ) + NChar( 0x2029 ) + NChar( 0x0009 ) +
      NChar( 0x000A ) + NChar( 0x000B ) + NChar( 0x000C ) + NChar( 0x000D ) +
      NChar( 0x0085 ) + N']%';
    -- NB: The   Len   function does not count trailing spaces.
    --     Use   DataLength   instead.
    with AnalyzedSamples as (
      select String, DataLength( String ) / DataLength( NChar( 42 ) ) as [StringLength],
        PatIndex( @WhitespacePattern, String ) - 1 as [LeftWhitespace],
        PatIndex( @WhitespacePattern, Reverse( String ) ) - 1 as [RightWhitespace]
      from @Samples ),
      TrimmedSamples as (
      select String, StringLength, [LeftWhitespace], [RightWhitespace],
        case
          when String is NULL then NULL
          when LeftWhitespace = -1 then N''
          else Substring( String, LeftWhitespace + 1, StringLength - LeftWhitespace )
          end as [LTrim],
        case
          when String is NULL then NULL
          when RightWhitespace = -1 then N''
          else Reverse( Substring( Reverse( String ), RightWhitespace + 1, StringLength - RightWhitespace ) )
          end as [RTrim],
        case
          when String is NULL then NULL
          when LeftWhitespace = -1 then N''
          else Substring( String, LeftWhitespace + 1, StringLength - LeftWhitespace - RightWhitespace )
          end as [Trim]
        from AnalyzedSamples )
      -- Column names are spelled exactly as declared above, so the query also runs under a case-sensitive collation.
      select N'"' + String + N'"' as [String], StringLength, [LeftWhitespace], [RightWhitespace],
        N'"' + [LTrim] + N'"' as [LTrim], DataLength( [LTrim] ) / DataLength( NChar( 42 ) ) as [LTrimLength],
        N'"' + [RTrim] + N'"' as [RTrim], DataLength( [RTrim] ) / DataLength( NChar( 42 ) ) as [RTrimLength],
        N'"' + [Trim] + N'"' as [Trim], DataLength( [Trim] ) / DataLength( NChar( 42 ) ) as [TrimLength]
        from TrimmedSamples;
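
    For a single value, the same technique can be condensed as in the sketch below. The variable names and the sample string are illustrative, and the pattern is shortened to four characters for brevity. The binary collation is an assumption added here, so that PatIndex compares code points exactly rather than by the database's default collation rules.

    declare @Pattern as NVarChar(100) = N'%[^' +
      NChar( 0x0020 ) + NChar( 0x0009 ) + NChar( 0x000A ) + NChar( 0x000D ) + N']%';
    declare @Input as NVarChar(100) = NChar( 9 ) + N' Hello world! ' + NChar( 13 ) + NChar( 10 );

    -- Assumes two bytes per character, i.e. no supplementary characters in the input.
    declare @Length as Int = DataLength( @Input ) / 2;
    declare @Left as Int = PatIndex( @Pattern, @Input collate Latin1_General_100_BIN2 ) - 1;
    declare @Right as Int = PatIndex( @Pattern, Reverse( @Input ) collate Latin1_General_100_BIN2 ) - 1;

    select case
      when @Input is NULL then NULL
      when @Left = -1 then N''  -- empty or whitespace-only input
      else Substring( @Input, @Left + 1, @Length - @Left - @Right )
      end as [Trimmed];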
    
  • 2020-11-27 23:46

    I'll be interested to see if anyone finds a generic SQL solution.

    The best I can come up with is a set of nested REPLACE calls that map each whitespace character to a plain space, followed by RTRIM or LTRIM:

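    -- MyTable is a placeholder for the table that holds the MyString column.
    SELECT LEFT(MyString, LEN(RTRIM(REPLACE(REPLACE(REPLACE(MyString COLLATE Latin1_General_100_BIN2, NCHAR(9), ' '), NCHAR(12), ' '), NCHAR(13), ' ')))) AS RTrimmed FROM MyTable;

    -- DATALENGTH / 2 is used instead of LEN because LEN ignores trailing spaces (the REPLACE result is nvarchar).
    SELECT RIGHT(MyString, DATALENGTH(LTRIM(REPLACE(REPLACE(REPLACE(MyString COLLATE Latin1_General_100_BIN2, NCHAR(9), ' '), NCHAR(12), ' '), NCHAR(13), ' '))) / 2) AS LTrimmed FROM MyTable;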
    

    And so on, with one more nested REPLACE for each additional whitespace character.

    You can get the list of current whitespace characters here:

    http://unicode.org/charts/uca/chart_Whitespace.html

    Or, to prove it to yourself, you could export a list of all characters from SQL Server to something like Excel, clean the characters, and import them back in. Whatever was removed was whitespace.
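
    One hypothetical way to produce such a list without leaving T-SQL is to generate every code point in the Basic Multilingual Plane (0 through 65535) and pair it with its character. The CTE and column names below are invented for this sketch:

    with Digits as (
      select n from ( values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15) ) as d(n) ),
      CodePoints as (
      -- Four hexadecimal digits combined give every value from 0 to 65535.
      select a.n * 4096 + b.n * 256 + c.n * 16 + d.n as CodePoint
        from Digits as a cross join Digits as b cross join Digits as c cross join Digits as d )
      select CodePoint, NChar( CodePoint ) as [Character]
        from CodePoints
        order by CodePoint;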
