Regex pattern inside SQL Replace function?

梦谈多话 2020-11-22 12:42
SELECT REPLACE('100.00 GB', '%^(^-?\d*\.{0,1}\d+$)%', '');

I want to replace any markup between two

10 Answers
  •  南笙 (original poster)
    2020-11-22 13:13

    You can use PATINDEX to find the index of the first occurrence of a pattern in a string, then use STUFF to replace the matched character with another string.
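    To make the PATINDEX/STUFF combination concrete, here is a minimal sketch on a single value rather than a table. The variable names `@s` and `@n` are only illustrative; the sample string is the one from the question.

    ```sql
    -- Illustrative only: strip every character that is not a digit or a dot.
    DECLARE @s varchar(50) = '100.00 GB';
    DECLARE @n int = PATINDEX('%[^0-9.]%', @s);  -- position of first illegal character, 0 if none

    WHILE @n > 0
    BEGIN
        SET @s = STUFF(@s, @n, 1, '');           -- delete exactly one character at position @n
        SET @n = PATINDEX('%[^0-9.]%', @s);      -- look for the next illegal character
    END;

    SELECT @s AS cleaned;                        -- returns '100.00'
    ```

    Note that the PATINDEX pattern uses LIKE-style wildcards and character classes, not a full regular expression, which is why the characters are removed one at a time.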

    Loop through each row and replace each illegal character with whatever you want; in your case, replace every non-numeric character with an empty string. The inner loop handles cells that contain more than one illegal character.

    DECLARE @counter int
    DECLARE @RetVal varchar(50)

    SET @counter = 0

    -- Outer loop: visit every ID up to and including the largest one
    WHILE(@counter <= (SELECT MAX(ID_COLUMN) FROM Table))
    BEGIN

        -- Inner loop: remove one illegal character per pass
        WHILE 1 = 1
        BEGIN
            -- STUFF returns NULL when PATINDEX finds nothing (position 0)
            SET @RetVal = (SELECT STUFF(Column, PATINDEX('%[^0-9.]%', Column), 1, '')
            FROM Table
            WHERE ID_COLUMN = @counter)

            IF(@RetVal IS NOT NULL)
              UPDATE Table SET
              Column = @RetVal
              WHERE ID_COLUMN = @counter
            ELSE
                break
        END

        SET @counter = @counter + 1
    END
    

    Caution: this is slow! Having a varchar column may affect performance, so using LTRIM and RTRIM may help a bit. Regardless, it is slow.

    Credit goes to this Stack Overflow answer.

    EDIT Credit also goes to @srutzky

    Edit (by @Tmdean) Instead of doing one row at a time, this answer can be adapted to a more set-based solution. It still iterates as many times as the maximum number of non-numeric characters found in any single row, so it's not ideal, but I think it should be acceptable in most situations.

    WHILE 1 = 1 BEGIN
        WITH q AS
            (SELECT ID_Column, PATINDEX('%[^0-9.]%', Column) AS n
            FROM Table)
        UPDATE Table
        SET Column = STUFF(Column, q.n, 1, '')
        FROM q
        WHERE Table.ID_Column = q.ID_Column AND q.n != 0;
    
        IF @@ROWCOUNT = 0 BREAK;
    END;
    

    You can also improve efficiency quite a lot if you maintain a bit column in the table that indicates whether the field has been scrubbed yet. (NULL represents "Unknown" in my example and should be the column default.)

    DECLARE @done bit = 0;
    WHILE @done = 0 BEGIN
        WITH q AS
            (SELECT ID_Column, PATINDEX('%[^0-9.]%', Column) AS n
            FROM Table
            WHERE COALESCE(Scrubbed_Column, 0) = 0)
        UPDATE Table
        SET Column = STUFF(Column, q.n, 1, ''),
            Scrubbed_Column = 0
        FROM q
        WHERE Table.ID_Column = q.ID_Column AND q.n != 0;
    
        IF @@ROWCOUNT = 0 SET @done = 1;
    
        -- if Scrubbed_Column is still NULL, then the PATINDEX
        -- must have given 0, so the row is clean: mark it 1.
        -- Rows just modified (0) go back to NULL for re-checking.
        UPDATE Table
        SET Scrubbed_Column = CASE
            WHEN Scrubbed_Column IS NULL THEN 1
            ELSE NULLIF(Scrubbed_Column, 0)
        END;
    END;
    

    If you don't want to change your schema, this is easy to adapt: store intermediate results in a table variable and apply them to the actual table at the end.
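    One way the table-variable adaptation might look, as a sketch only. The table variable `@work`, its `Val` column, and the `varchar(50)` width are assumptions made for illustration; `[Table]`, `[Column]`, and `ID_Column` are the placeholders used above, bracketed here because TABLE and COLUMN are reserved words.

    ```sql
    -- Hypothetical working copy; adjust the types to match the real columns.
    DECLARE @work TABLE (ID_Column int PRIMARY KEY, Val varchar(50));

    INSERT INTO @work (ID_Column, Val)
    SELECT ID_Column, [Column]
    FROM [Table];

    -- Scrub the copy: each pass removes one illegal character per affected row.
    WHILE 1 = 1
    BEGIN
        UPDATE @work
        SET Val = STUFF(Val, PATINDEX('%[^0-9.]%', Val), 1, '')
        WHERE PATINDEX('%[^0-9.]%', Val) > 0;

        IF @@ROWCOUNT = 0 BREAK;  -- nothing left to clean
    END;

    -- Apply the cleaned values to the real table in a single statement.
    UPDATE t
    SET [Column] = w.Val
    FROM [Table] AS t
    JOIN @work AS w ON w.ID_Column = t.ID_Column;
    ```

    With this shape the real table is written only once, at the end, and no extra bit column is needed; the trade-off is that the working copy has to fit comfortably in a table variable.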
