What is the best way to replace all \'<\' with <
in a given database column? Basically perform s/<[^;]/</gi
Notes
Very specific to this pattern, but I have done similar to this in the past:
REPLACE(REPLACE(columName, '<', '<'), '<', '<')
broader example (encode characters which may be inappropriate in a TITLE attribute)
REPLACE(REPLACE(REPLACE(REPLACE(
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
columName
-- Remove existing encoding:
, '&', '&')
, '"', '"')
, ''', '''')
-- Reinstate/Encode:
, '&', '&')
-- Encode:
, '"', '"')
, '''', ''')
, ' ', '%20')
, '<', '%3C')
, '>', '%3E')
, '/', '%2F')
, '\', '%5C')
How about:
UPDATE tableName
SET columName = REPLACE(columName , '<', '<')
WHERE columnName LIKE '%lt%'
AND columnName NOT LIKE '%lt;%'
Edit:
I just realized this will ignore columns with partially correct <
strings.
In that case you can ignore the second part of the where clause and call this afterward:
UPDATE tableName
SET columName = REPLACE(columName , '<;', '<')
I think this can be done much cleaner if you use different STUFF :)
create table test
(
id int identity(1, 1) not null,
val varchar(25) not null
)
insert into test values ('< <- ok, < <- nok')
WHILE 1 = 1
BEGIN
UPDATE test SET
val = STUFF( val , PATINDEX('%<[^;]%', val) + 3 , 0 , ';' )
FROM test
WHERE val LIKE '%<[^;]%'
IF @@ROWCOUNT = 0 BREAK
END
select * from test
If MSSQL's regex flavor supports negative lookahead, that would be The Right Way to approach this.
s/<(?!;)/</gi
will catch all instances of < which are not followed by a ; (even if they're followed by nothing, which [^;] would miss) and does not capture the following non-; character as part of the match, eliminating the issue mentioned in the comments on the original question of that character being lost in the replacement.
Unfortunately, I don't use MSSQL, so I have no idea whether it supports negative lookahead or not...
This article covers how to create a simple Regex Replace function that you can use in SQL 2000 (and 2005 with simple tweak) that can assist you.
Some hacking required but we can do this with LIKE, PATINDEX, LEFT AND RIGHT and good old string concatenation.
create table test
(
id int identity(1, 1) not null,
val varchar(25) not null
)
insert into test values ('< <- ok, < <- nok')
while 1 = 1
begin
update test
set val = left(val, patindex('%<[^;]%', val) - 1) +
'<' +
right(val, len(val) - patindex('%<[^;]%', val) - 2)
from test
where val like '%<[^;]%'
IF @@ROWCOUNT = 0 BREAK
end
select * from test
Better is that this is SQL Server version agnostic and should work just fine.