Perform regex (replace) in an SQL query

后端 未结 6 1461
闹比i
闹比i 2020-12-01 11:08

What is the best way to replace all \'<\' with < in a given database column? Basically perform s/<[^;]/</gi

Notes

相关标签:
6条回答
  • 2020-12-01 11:26

    Very specific to this pattern, but I have done similar to this in the past:

    REPLACE(REPLACE(columName, '&lt;', '&lt'), '&lt', '&lt;')

    broader example (encode characters which may be inappropriate in a TITLE attribute)

    REPLACE(REPLACE(REPLACE(REPLACE(
    REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
        columName
        -- Remove existing encoding:
        , '&amp;', '&')
        , '&#34;', '"')
        , '&#39;', '''')
        -- Reinstate/Encode:
        , '&', '&amp;')
        -- Encode:
        , '"', '&#34;')
        , '''', '&#39;')
        , ' ', '%20')
        , '<', '%3C')
        , '>', '%3E')
        , '/', '%2F')
        , '\', '%5C')
    
    0 讨论(0)
  • 2020-12-01 11:34

    How about:

        UPDATE tableName
        SET columName = REPLACE(columName , '&lt', '&lt;')
        WHERE columnName LIKE '%lt%'
        AND columnName NOT LIKE '%lt;%'
    

    Edit:

    I just realized this will ignore columns with partially correct &lt; strings.

    In that case you can ignore the second part of the where clause and call this afterward:

        UPDATE tableName
        SET columName = REPLACE(columName , '&lt;;', '&lt;')
    
    0 讨论(0)
  • 2020-12-01 11:43

    I think this can be done much cleaner if you use different STUFF :)

    create table test
    (
        id int identity(1, 1) not null,
        val varchar(25) not null
    )
    
    insert into test values ('&lt; <- ok, &lt <- nok')
    
    WHILE 1 = 1
    BEGIN
        UPDATE test SET
            val = STUFF( val , PATINDEX('%&lt[^;]%', val) + 3 , 0 , ';' )
        FROM test
        WHERE val LIKE '%&lt[^;]%'
    
        IF @@ROWCOUNT = 0 BREAK
    END
    
    select * from test
    
    0 讨论(0)
  • 2020-12-01 11:48

    If MSSQL's regex flavor supports negative lookahead, that would be The Right Way to approach this.

    s/&lt(?!;)/&lt;/gi
    

    will catch all instances of &lt which are not followed by a ; (even if they're followed by nothing, which [^;] would miss) and does not capture the following non-; character as part of the match, eliminating the issue mentioned in the comments on the original question of that character being lost in the replacement.

    Unfortunately, I don't use MSSQL, so I have no idea whether it supports negative lookahead or not...

    0 讨论(0)
  • 2020-12-01 11:50

    This article covers how to create a simple Regex Replace function that you can use in SQL 2000 (and 2005 with simple tweak) that can assist you.

    0 讨论(0)
  • 2020-12-01 11:51

    Some hacking required but we can do this with LIKE, PATINDEX, LEFT AND RIGHT and good old string concatenation.

    create table test
    (
        id int identity(1, 1) not null,
        val varchar(25) not null
    )
    
    insert into test values ('&lt; <- ok, &lt <- nok')
    
    while 1 = 1
    begin
        update test
            set val = left(val, patindex('%&lt[^;]%', val) - 1) +
                          '&lt;' +
                          right(val, len(val) - patindex('%&lt[^;]%', val) - 2)
        from test
        where val like '%&lt[^;]%'
    
        IF @@ROWCOUNT = 0 BREAK
    end
    
    select * from test
    

    Better is that this is SQL Server version agnostic and should work just fine.

    0 讨论(0)
提交回复
热议问题