T-SQL strip all non-alpha and non-numeric characters

后端 未结 5 1786
予麋鹿
予麋鹿 2020-12-03 12:48

Is there a smarter way to remove all special characters rather than having a series of about 15 nested replace statements?

The following works, but only handles thr

相关标签:
5条回答
  • 2020-12-03 13:09

    If you can use SQL CLR you can use .NET regular expressions for this.

    There is a third party (free) package that includes this and more - SQL Sharp .

    0 讨论(0)
  • 2020-12-03 13:20

    One flexible-ish way;

    CREATE FUNCTION [dbo].[fnRemovePatternFromString](@BUFFER VARCHAR(MAX), @PATTERN VARCHAR(128)) RETURNS VARCHAR(MAX) AS
    BEGIN
        DECLARE @POS INT = PATINDEX(@PATTERN, @BUFFER)
        WHILE @POS > 0 BEGIN
            SET @BUFFER = STUFF(@BUFFER, @POS, 1, '')
            SET @POS = PATINDEX(@PATTERN, @BUFFER)
        END
        RETURN @BUFFER
    END
    
    select dbo.fnRemovePatternFromString('cake & beer $3.99!?c', '%[$&.!?]%')
    
    (No column name)
    cake  beer 399c
    
    0 讨论(0)
  • 2020-12-03 13:23

    Create a function:

    CREATE FUNCTION dbo.StripNonAlphaNumerics
    (
      @s VARCHAR(255)
    )
    RETURNS VARCHAR(255)
    AS
    BEGIN
      DECLARE @p INT = 1, @n VARCHAR(255) = '';
      WHILE @p <= LEN(@s)
      BEGIN
        IF SUBSTRING(@s, @p, 1) LIKE '[A-Za-z0-9]'
        BEGIN
          SET @n += SUBSTRING(@s, @p, 1);
        END 
        SET @p += 1;
      END
      RETURN(@n);
    END
    GO
    

    Then:

    SELECT Result = dbo.StripNonAlphaNumerics
    ('My Customer''s dog & #1 friend are dope, yo!');
    

    Results:

    Result
    ------
    MyCustomersdog1friendaredopeyo
    

    To make it more flexible, you could pass in the pattern you want to allow:

    CREATE FUNCTION dbo.StripNonAlphaNumerics
    (
      @s VARCHAR(255),
      @pattern VARCHAR(255)
    )
    RETURNS VARCHAR(255)
    AS
    BEGIN
      DECLARE @p INT = 1, @n VARCHAR(255) = '';
      WHILE @p <= LEN(@s)
      BEGIN
        IF SUBSTRING(@s, @p, 1) LIKE @pattern
        BEGIN
          SET @n += SUBSTRING(@s, @p, 1);
        END 
        SET @p += 1;
      END
      RETURN(@n);
    END
    GO
    

    Then:

    SELECT r = dbo.StripNonAlphaNumerics
    ('Bob''s dog & #1 friend are dope, yo!', '[A-Za-z0-9]');
    

    Results:

    r
    ------
    Bobsdog1friendaredopeyo
    
    0 讨论(0)
  • 2020-12-03 13:23

    I know this is an old thread, but still, might be handy for others. Here's a quick and dirty (Which I've done inversely - stripping out non-numerics) - using a recursive CTE. What makes this one nice for me is that it's an inline function - so gets around the nasty RBAR effect of the usual scalar and table-valued functions. Adjust your filter as needs be to include or exclude whatever char types.

            Create Function fncV1_iStripAlphasFromData (
                @iString Varchar(max)
            )
            Returns 
            Table With Schemabinding
            As
    
                Return(
    
                    with RawData as
                    (
                        Select @iString as iString
                    )
                    ,
                    Anchor as
                    (
    
                        Select Case(IsNumeric (substring(iString, 1, 1))) when 1 then substring(iString, 1, 1) else '' End as oString, 2 as CharPos from RawData
                        UNION ALL
                        Select a.oString + Case(IsNumeric (substring(@iString, a.CharPos, 1))) when 1 then substring(@iString, a.CharPos, 1) else '' End, a.CharPos + 1
                        from RawData r
                        Inner Join Anchor a on a.CharPos <= len(rtrim(ltrim(@iString)))
    
                    )
    
                    Select top 1 oString from Anchor order by CharPos Desc
    
                )
    
    Go
    
    select * from dbo.fncV1_iStripAlphasFromData ('00000')
    select * from dbo.fncV1_iStripAlphasFromData ('00A00')
    select * from dbo.fncV1_iStripAlphasFromData ('12345ABC6789!&*0')
    
    0 讨论(0)
  • 2020-12-03 13:26

    I faced this problem several years ago, so I wrote a SQL function to do the trick. Here is the original article (was used to scrape text out of HTML). I have since updated the function, as follows:

    IF (object_id('dbo.fn_CleanString') IS NOT NULL)
    BEGIN
      PRINT 'Dropping: dbo.fn_CleanString'
      DROP function dbo.fn_CleanString
    END
    GO
    PRINT 'Creating: dbo.fn_CleanString'
    GO
    CREATE FUNCTION dbo.fn_CleanString 
    (
      @string varchar(8000)
    ) 
    returns varchar(8000)
    AS
    BEGIN
    ---------------------------------------------------------------------------------------------------
    -- Title:        CleanString
    -- Date Created: March 26, 2011
    -- Author:       William McEvoy
    --               
    -- Description:  This function removes special ascii characters from a string.
    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    
    
    declare @char        char(1),
            @len         int,
            @count       int,
            @newstring   varchar(8000),
            @replacement char(1)
    
    select  @count       = 1,
            @len         = 0,
            @newstring   = '',
            @replacement = ' '
    
    
    
    ---------------------------------------------------------------------------------------------------
    -- M A I N   P R O C E S S I N G
    ---------------------------------------------------------------------------------------------------
    
    
    -- Remove Backspace characters
    select @string = replace(@string,char(8),@replacement)
    
    -- Remove Tabs
    select @string = replace(@string,char(9),@replacement)
    
    -- Remove line feed
    select @string = replace(@string,char(10),@replacement)
    
    -- Remove carriage return
    select @string = replace(@string,char(13),@replacement)
    
    
    -- Condense multiple spaces into a single space
    -- This works by changing all double spaces to be OX where O = a space, and X = a special character
    -- then all occurrences of XO are changed to O,
    -- then all occurrences of X  are changed to nothing, leaving just the O which is actually a single space
    select @string = replace(replace(replace(ltrim(rtrim(@string)),'  ', ' ' + char(7)),char(7)+' ',''),char(7),'')
    
    
    --  Parse each character, remove non alpha-numeric
    
    select @len = len(@string)
    
    WHILE (@count <= @len)
    BEGIN
    
      -- Examine the character
      select @char = substring(@string,@count,1)
    
    
      IF (@char like '[a-z]') or (@char like '[A-Z]') or (@char like '[0-9]')
        select @newstring = @newstring + @char
      ELSE
        select @newstring = @newstring + @replacement
    
      select @count = @count + 1
    
    END
    
    
    return @newstring
    END
    
    GO
    IF (object_id('dbo.fn_CleanString') IS NOT NULL)
      PRINT 'Function created.'
    ELSE
      PRINT 'Function NOT created.'
    GO
    
    0 讨论(0)
提交回复
热议问题